
McGRAW-HILL 

INDUSTRIAL ORGANIZATION AND MANAGEMENT SERIES 
L. C. MoRito\r, Consulting Editor 

•k -k n 


PRINOIPLBS OP PEESONNEL TESTING 





McGRAW-HILL 

INDUSTRIAL ORGANIZATION AND MANAGEMENT SERIES 
L, C. Mokkov, C(msvltmg Editor 

* * * 


PRINCIPLES OP PERSONNEL TESTING 




McGRAW-HILL INDUSTRIAL ORGANIZATION 
AND MANAGEMENT SERIES 
L. G. Moiinow, Consulting Editor 

★ ★ ★ 

Bethel, Atwatee, and SftACKMAit— Industrial Organization 

and Management 

Bethel, Atwatee, Smith, and Stackman— EM snifolf of Induslrlal 
Managefnent 

CAnnOLL— How to Chart Timestudij Data 
Caueoll-Hou) to Control Production Costs 
FBiGBNOAVM—Qualitij Control: Principles, Practice, 
and Administration 

Finlay, Saetain, and Tate— H uman Behaolor in Industry 
GAEDNEE-Fro^t Management and Control 
GnANT-Statistical Quality Control 

HANNAmm-Conference Leadership in Business and Imlustry 
HsiDB-Industriid Process Control by Slatislical Methods 
RYOiB—Pundamentals of Successful Maimfaclurbxg 
Immeb— L at/out Flaming Tec/tniques 
JuoAn^Quality-control Handbook 
Kalsem— F racttca? Superolslon 

Kimball and KxuBAiXr-PrincIples of Indmtrial Organization 
LAUDY-Froducllon Planning and Control 
LAwsHE-PriflctpIoo of Personnel Testing 

Maynaed, Stegemerten, and ScawAn-Metbods-Time Measurement 

MicHAEii— Wage and Salary Fundamentals and Procedures 

NBUBCBBif-Streamlining Business Procedures 

Smyth ond Muiu>iiy— 7o& Evaluation and Employee Rating 

Staniae— P lant Engineering Handbook 

TooTLB-Employees Are People 

YooNO^Personnel Manual for Ezeoutloes 



PRINCIPLES OF 
PERSONNEL TESTING 


by 


C. H. LAWSHE, Jr. 


Profestor of Ptyohology, Purdue TJnivernty 


McGRAW'HILL BOOK COMPANY, INC. 

NEW YORK TORONTO LONDON 


194B 



PMNCIPLPIS OF PERaONNEL TESTINO 


CoPYUMHTi 1048, «T »PnFI 
McGiUAV^-IIitiii 13 ook Company, Inc* 

PniNTBD IN TUB Unitkd Statk 8 or Amrrica 

All righU tCRcnycd* This book, or 
parts Uicrcof, may not bo reproducod 
in any jorm imthout permission o) 

Iho publishers^ 

yill 



To 

MURIEL 



PREFACE 


The recent war years have clearly demonstrated the effective- 
ness of personnel tests both in industry and in tlie military serv- 
ices. That they are genuinely useful managerial tools is attested 
by the fact that psychological tests have been employed in some 
industries for aJmost two decades. However, the adoption of per- 
sonnel tests in business and industry has been retarded somewhat 
by the lack of trained personnel to administer testing programs 
and by the lack of information about tests on the part of those in 
managerial capacities. 

It is the hope of the author that this book will prove useful to 
those now in or soon to be in managerial positions as a statement 
of what can legitimately bo expected of tests and as a guide to the 
establishment of tho policy framework within which a testing 
program must function. It is further hoped that this book will be 
useful in the training of those who will eventually administer test- 
ing programs. As every applied psychologist knows, it is often 
too gi’eat a stop from the tlieory and perfectionism of the class- 
room to the reality and pragmatism of business and industry. Be- 
cause of this fact, emphasis has been placed upon procedure rather 
than theory and upon results rather than rationale. 

Acknowledgments are justly due those who contributed di- 
rectly and indirectly, especially to the author’s colleague and 
former teacher, Dr. Joseph TiflBn, who with the author inaugu- 
rated the Purdue Industrial Personnel Testing Institute and who 
collaborated on many of the research studies reported here; to 
Dr. F. B. Knight, the author’s immediate superior at the univer- 
sity, whose generous research policy has made possible many of 
the author’s investigations; and to A. C. Eckerman for wise coun- 
sel and permission to adapt material prepared by him for Chap. 
XIV and Appendix B. The author is indebted to Prof. Ronald 

Tli 




PREFACE 


A. Fiahor and Dr. Frank Yatoa, also to Oliver & Boyd Lkl., TSdin- 
burgh, for pci’inission to abridge Table IX from their book, Star- 
tistical Tables for Biological, Agricultural, and Medical Ro- 
search. Further acknowledgment is mailo of tho kind assistance 
of Max II. Forster, Williiuti II. AiigofF, II. A. Sherman, D. E. 
Cole, and Dr. Fi’ank Stiuup, each of whom renil critically all 
or poi't of tlio mmiuscript. Acknowledgment is made anony- 
moudy of the help of many persons in tho industrial field witli 
whom tho author hos hod consulting relationships, and whose 
questions and problems have contributed towai’d tho crystollisso- 
lion of tho author’s pouit of view here presented. To all of those, 
to tho many other friends too numerous to mention, and to the 
author’s secretary, Mrs. Elaino Bonnet, who processed the man- 
usci'ipt through its several stages, siiiccro thanks and apprecia- 
tion are extended. 

C. II. Lawsiud, Jn. 

WhBT LAPAYKra, In!>. 

February, 1048 



CONTENTS 


PAOE 


Preface vii 

aHAFTBR 

I. The Basis OF Personnel Testing 1 

II. Pbogbdtjkbs FOB Choosing Tests 10 

Present Employee Method — ^The Follow-up Method 
— Comparison of Methods — Summary. 

III. Measures of Job Success 20 


Production Data — Personnel Data — .Tiidgracnt of 
Others — How to Obtain Supervisory Ratings for Use 
as Test Criteria — Job Samples — ^The Single Vai’iablo 
— Summary. 

IV. Methods for Analyzing and PRBSBNTiNa Facts . 40 

The Scattergram — Tlie Method of Averages — ^The 
Method of Percentages — ^The Profile Method— Sum- 
mary. 

V. Mentaij Ability Tests 54 

Mental Ability and Job Placement— Sperafio Tests 
of Mental Ability — Summary. 

VI. Temperament and Personality Tests .... 76 

The Nature of Temperament — ^Theory of Tempera- 
ment Measurement — Specific Experiences with Tem- 
perament Tests — Summary. 

VII. Intbreeit and Preference Tests 87 

The Measurement of Interests — Occupational Group 
Differences — Ability Group Patterns— Limitations of 
Interest Tests— Summary. 

is 




X 


VOtfTBm'8 


oitAnai 

Vm. Visual Skill Tbbts 06 

The Nature of Vision — ^Monsurcmont of Visual Char- 
actorisUcs—The Validation of Vision Tosts— Tho Vis- 
ual Profile— Safely anil Vision— Suimnaiy. 

IX. Tests ron Mechanical and Otiieu Manual 

Workers 123 

Job Knowledge or Traile Tosts — Assemblers, Pack- 
ers, and Inspectors — Operators and Machine Attend- 
ers— Machine-tool Learners and Apprentices — Serv- 
ice Electricians and Repairmen — Other Trade Groups 
— Summary. 

X. Tests for CuaRiCAL and Other Office Employees 151 
Office Clerks — Typists — ^Stenographers — Oifioo Ma- 
chine Operators^ob and Eoctor Analysis — Sum- 
mary. 

XI. Tests for Salesmen and RETAiiz-samE Employees 162 
Insurance Salesmen — ^Miscellancotis Halcsmoii — Ro- 
toil-storo Personnel — Summary. 

XII. Tests iur Supervisory, Profbshional, ani> Execu- 


tive Personnel 171 

Tests for Supervisors — ^Professional Personnel— Ex- 
ecutive Personnel — Summary. 

Xin. How TO Construct a Test 177 


Preparing a Tost Budget — ^What a Worker Needs to 
Know — Constructing Items — ILom-onalysia Proce- 
dures — ^Using Item-analysis Results. 

XIV. iNAUQURATINa AND OpBRATINO A TbSTINQ IhlOORAM 192 
Tho Basic Procedure : Establish Ike PersonneUtesHng 
Policy — Introdiico the Program in the Plant — Iden- 
tify Jobs or Departments Having Personnel Problems 
— Obtain Job Descriptions for These Jobs — Select 
Tests for Tryout— Select CrUerion Groups Ad- 



CONTENTS ad 

FAOB 

minister Tests — Establish the Operating Test Pro- 
gram — Summary. 

Appendix 

A. Samplixto Theory and Practice 205 

B. Fundamentals op Test Administration . . . 213 

C. Commercially Available Tests ...... 216 

Index 221 




CHAPTER. I 

THE BASIS OF PERSONNEL TESTING 


Human beings differ. They differ in their physical attributes, 
their abilities, their temperaments, their interests, and their atti- 
tudes. Because they differ in these personality choracteristies, 
they naturally differ in the ways in which they perform their 
jobs. Some employees in a given group are better than others in 
spite of how good or how poor tlie group is os a whole. How great 
ore these differences; how are they distributed; and what has 
personnel testing to contribute to the employment or upgrading 
situation in tho light of tlnese facts? The purpose of this opening 
chapter is to deal specifically witli these questions. 

Production Output. — One company engaged in the manufac- 
turing of diamond dies which are used for extruding fine tungsten 
wire had eleven men on the job of drilling. These men were 
engaged in the process of drilling holes through industrial dia- 
monds. They were all selected because they showed promise of 
succeeding in that work. The obvious misfits, the floaters and 
ne'er-do-wells, and the disinterested were never hired. The 
eleven in question were chosen because, in terms of the usual 
employment procedures, they seemed to be good prospects. A 
later analysis of their production records, however, revealed 
important differences in thdr output. The table on the follow- 
ing page shows the number of dies that each drilled in a six week’s 
period. Note that one individual processed 741 dies while an- 
other processed only 288. The group average was approximately 
450; and if this is considered 100 per oent, the best man produced 
165 per cent, or 6S per cent above average, and the poorest man 
produced 64 per cent, or 36 per cent below average. Stated an- 
other way, the best employee produced about 2.6 times as much 
pa the poorest man in the group. It should be remembered that 

1 




3 PRINCIPLES OP PERSONNEL TSSTINO 

theso eleven incUvidunls woi’o oonsklcred equally good risks at the 
time of employment. 


Employoo A 

. , . . 741 dial 

Employee B 

.... 572 dies 

Employoo 0 
Employee I") 

. . . . 510 dies 
. . , . 4D8 dies 

Employee E 

. . , . 456 dies 

Employee F 

. . . . 436 dies 

Employee G 

. . 410 dias 

Employee 11 

.... 372 dies 

Employee I 

. . . . 342 dies 

Employee J 

. . , . 304 dies 

Employee K 

.... 288 dies 


Quality of Work. — ^Rmploycea differ not only in their rate of 
produotion but also in the (luality of their work. Forty-five 
employees in tlio toolroom of a company engai^ud in making a 
particular precision part for aircraft engines were each given 



Fm. l-l^Diatribution of aooumoy wioroa mwln l)y forty*«rivn toolroom 
pioyoea in xncnauring nmetcon pioccs of motal with vernier iiucromotorH. 

nineteen metal pieces of varying siscs and shapes and asked to 
measure them with standai'd vernier micrometers.^ Each part 
had been previously measured by ultraprcoision methods, and 
each employee was given an accuracy score that was the percent- 
age of pieces he measured correctly within 0.0001 inch of the 
“true” measurement. Figure 1-1 shows the distribution of those 

* ItAwsine, C. II., Jit., AND TinriH, Johkpii. Tho aetturaoy of pronirann iniitru- 
meat uoasuromont in induBlrial inapootion. J. appl. Ptyehol,, IMS, SO, 413-410. 


THE BASIS OF PERSONNEL TESTING 8 

scores. Kote that one man had an accuracy score bet^reen zero 
and 10 per cent, one other had a score between 10 and 20 per 
cent. Note also that two men had scores between 60 and 70 per 
cent and that Uie mean or average of tlie group was 43 per cent. 
The original data from which the figure was prepared indicate 
that the poorest performer had measured 5 per cent (actually 
only one) of the parts correctly while the best pei’former meas- 



Fia. l-2^Distribution of times required by forty->si!C order checkers to check 
a 8tandai*d order of grocorlca. 


ured 63 per cent, or twelve, of the nineteen pieces correctly. 
Note that the curve approaches the shape of a bell. 

Ways of Evaluating Job Performance. — There are many 
ways of evaluating job performance which will be treated sys- 
tematically and more thoroughly in Chap. III. For the time 
being it is sufiicient to point out that, when production data are 
used for this evaluation, the figures always reflect quantity, 
quality, or a combination of the two. Just what form the facts 
assume is quite often a function of the nature of the job or the 
work. A few examples from a variety of jobs will suffice. 

Grocery Checkers. — ^In one study involving cashier-checkers 
in a supermarket, forty-six employees checked a standard cus- 
tomer order and were given time scores. Figure 1-2 shows that 
the time in seconds required for checking tins particular order 


4 PRINCIPLES OP PERSONNEL TESTING 

ranged from a low of 225 seconds to a high of 675 seconds with a 
mean or average of 364. Checking, like many jobs, is one in 
which skill and speed am developed with expeiience. An analy- 
sis of tlie records indicated that seventeen of the forty-six check- 
ers had been on the job less than six months. Another curve 
(Fig. 1-3) was prepared in which only the twenty-nine employ- 
ees witli six mouths’ experience or more were included. Note, 



Fia. l^^Distrihutions of times required by twonty'-nino experienced order 
olieokeTB to check a standard order of groceries. 


however, that the time range, although not so large, is still from 
225 to 475 seconds and the average is 344. Even when the less 
experienced checkers or tlic learners are eliminated, the best per- 
formers are 35 per cent above average and the poorest arc 53 per 
cent below average. The best ones require a little less than half 
the time required by the slower ones, even when the beginners 
are not included. The implications in terms of number of check- 
ers needed to handle a particular volume during rush hours seem 
clear. 

Wool Pullers.— In packing houses, employees are engaged to 
pull wool from sheep pelts by hand. One study of number of 
pelts pulled involved thirteen employees, and the results are 
shown in Pig. 1-4. During the time studied, the average number 


THE BASIS OP PEBSONNEL TESTING 5 

of pdts pulled was 156. One man, however, pulled 216, or 138 
per cent of average, and another pulled only 119, or 76 per cent 
of average. There again, the ratio of the best to the poorest is 
almost two to one. 

Department-store Salespersons. — In the sales field, adequacy 
of sales personnel is frequently measured in terms of volume. In 
the curtain and drapery department of one large department 



Fw. 1-4. — Distribution of number of pelts pulled by thirtcon packing-house 
amployccs. 


store, ci^tcen clerks averaged a little over $4,000 each in one 
monthly period. One employee, however, sold over $7,100 worth 
of inorchandiso (177 per cent of average), and another sold only 
about $500 worili (13 per cent of average). In sales, however, 
total volume does not tell the entire story. Department stores 
expect to have some merchandise returned. The amount of per- 
centage of sales returned is one measure of sales success. In 
this same department, for this same month, returned merchan- 
dise records were studied. As Pig. 1-5 indicates, one employee 
had only 4.6 per cent of her dollar sales returned, while one other 
had 21.3 per cent of her dollar sales returned. All other em- 
ployees wore between these extremes, and the average return was 
in the vicinity of 9 per cent. Regardless of whether sales volume 
or return percentage is considered, tliere are extensive differ- 
ences between sales employees. 

Characteristics of Production Data. — ^The examples given 
above, purposely drawn from unlike jobs, demonstrate three 
specific facts. (1) Production records (quality, quantity, or 
other kinds) point to real differences in job performance. (2) 
The magnitude of the difference between the best performance 
and the poorest performance is in the ratio of approximately 


6 PRINCIPLES OF PSmiNNBL TESTING 

two to one or hdgher, depending in part on whether or not inex- 
perienced employeoa have been considered. (3) A distribution 
of the performance of a group of employees follows a character- 
istic bell-shaped pattern with the largest group in tlic center, 
near the average, a small group of exceptionally good employees 
at one end, and anoUier small group of oxccplioually poor cm- 



PiQ, IMBtribntion of pcrcenbiBo of mcrdinoiliM TDlunit^ for a RKntp of 
curtain and drapery Halcoladii'fl m a doparlnipiit Hlorcs 

ployees at the other, even when beginners have not been con- 
sidered. These arc well-known psychological facts that mani- 
fest themselves in job performance just os they do in the 
psychological laboratory and in all oUicr areas of human be- 
havior. 

Determiners of Differences. — It is not within tire province of 
this book to present an involved theoretical treatment of tJie 
factors that operate to produce these ability or performance 
differences that exist among people. Perhaps it is suflicient to 
point out that the multiplicity of hereditary and environraontal 
influences interacting with each oilier result in these human dif- 
ferences. It is important, however, to note that any individuol’s 
performance may be improved through better training and better 
work methods, that his performance may be either improved or 
hindered by changes in lighting, hours of work, and other specific 
job changes. However, the differences that have been mentioned 
above cannot be explained away in terms of these factors. Under 



THE BASIS OF PERSONNEL TESTINO 7 

given lighting conditions the bell-shaped distribution will be 
found. A change in lighting conditions may result m increased 
production; everyone may improve, but the cliaraoteristio pat- 
tern will remain. 

Other Employee Differences. — ^Some employees work less 
than a week at a new job and tlicn quit. Others work a little 



Fig. 1-0.— Difltribution of number of accidents in a six months' period for 6S0 
employees in an automobile-manufacturing plant. {From Nowbold, By permts- 
sion o/ the ConlxoUer o/ Kk BWtanatc MajQzty*^ StaUanary Oilice.) 

longer and then decide to leave. Most company records indicate 
that the greatest likelihood of losing an employee occurs in the 
first few days or weelcs of employment and that the longer an 
employee is on the job, other things being equal, the greater the 
probability of his staying with the company. A curve, then, con- 
structed with work periods on the base line, designed to show the 
percentage who quit in the first period after employment, the 
percentage who quit in the second period after employment, and 
so on, would assume a shape characterized as being high on the 
left and tapering off toward the base line as it moved to the right. 
This kind of curve is characteristic of certain types of data in- 
cluding accident and absentee data. 


$ PRINCIPLES OP PERSONNEL VESTINQ 

Accident JRecords. — Figure 1-6 taken from an English study* 
is typical of most accident distributions. In an automobile- 
manufacturing plant the records of 680 employees for a six 
months’ period were examined. As the figure slmws, 26.8 per 
cent of the group had no accidents at all; 21.8 per cent had one 
or two; 18.7 per cent had three or four; and so on, the greater 
the number of accidents, the smaller the pei’contago of employees 
falling in the category. 



Fio. l-7/~t)istribution of mimbor of absenoos in n six moutlis' poi-iod for 151 
emplqyoGfl ia a ciialing Bhop. {From Fox and SeoU.) 

Absentee Records. — Absentee records show a similar distribu- 
tion. Figure 1-7 shows the distribution of absences among 161 
employees * in a casting shop for a six months’ period. About 31 
per cent of the group had no absences during tlic period studied; 
about 22 per cent had one; and so on, the more absences, the 
fewer the employees falling into that categoiy. 

What Is a Good Employee? — Eveiy management must decide 
what it expects from its employees. Nearly everyone, however, 
would agree that those who are accident free, who come to work 
regularly, who stay on tlie job a reasonable length of time, who 
produce a lot of goods or sell a lot of mcroliandise, and who pro- 
duce goods of high quality or sell merchandise that is not re- 
turned are die best employees. Which criterion is most im- 
portant or how relatively important eadi is, is a question that 

* NttwnoiiD, B, M. A coturibulion to lha study of the Jaolor in Iho 

eausalioti o/ aecidotUa. Groat Britaiii Medical Research Council, Research Board, 
Reprint No. SS, 1020. 

* Fax, JoHH B., AND Scott, Jurouh P. .dluentcoum; manaoemeuFs jtroblem. 
Busmesa Research Study No. 20, Vol. 30, No. i, Dooombor 1043. Boston: Ilnr- 
TBTd Univoraty, Bureau of Buriness Rcsoarcti. 



THU BASIS OF PERSONNEL TESTINO 9 

must be decided in tlic light of each company’s own operations 
and its own policy. 

Why Test? — One fact, however, is universal. To the extent 
that a company can select more and moro employees who fall 
at the better end of these distributions and fewer and fewer who 
fall at the poorer ends of the curve, the better the result. In the 
case of the grocery checkers, for instance, if by means of per- 
sonnel tests it were possible to eliminate, prior to hiring, all 
checkers who require more than 400 seconds (after ax months) 
to check the standard order upon which Fig. 1-3 is based, the 
effect would be far-reaching. Either the store in question would 
need fewer checkers, or customers could be passed through the 
checking stands at a faster rate of speed, thus reducing time in 
line and improving customer good will. The best personnel on 
any job means lower costs, better customer relations, and fewer 
managerial and supervisory problems. Personnel testing can 
contribute to these objectives. 

What Tests Will and Will Not Do. — The question of what 
tests will do and what they will not do can best be answered by 
this whole book. A few words in the nature of a preview are in 
order, however. Tests are not a cure-all for all personnel ills. 
They cannot be used to clean up the results of mismanagement 
and supervisory bungling. They will not always work in every 
situation; and when they do work, they will not yield perfect 
results. The adequacy of a test or a testing program is evalu- 
ated, not in terms of perfection but in terms of batting odds. 
A particular test should not be (^iticized because it resulted in 
tlie hiring of one or two bad employees but rather should be 
evaluated in terms of whether or not it selected fewer bod em- 
ployees than the previously used technique. 

Tests as an Aid. — ^And finally it should be clearly understood 
that tests are not advocated as a substitute for tried and true 
selection and placement procedures. Instead, tests are instru- 
ments that yield facts about the applicant, which facts, in com- 
bination with other facts obtained from the application blank, 
from references, and from the interview, make possible a more 
intelligent and reliable hiring decision. 



CHAPTHB II 


PROCEDURES FOR CHOOSING TESTS 


Since it is clearly appai'enl that employees do differ in the 
degree to wbidi they measure up to the requirements of their jobs, 
tlie basic problem becomes one of selecting employment and 
placement tools that will aid in increasing tlm number of desir- 
able employees and decreasing the number of less desirable em- 
ployees. 

Tests Sometimes Misrepresented. — In many instances, tests 
have been misrepresented. Individuals inexperienced in their 
use have been led to believe that they will oocoinplisli far more 
than can rightfully be expected of them. Fii-st of all, no single 
test or combination of tests mil ever do a perfect job. There ore 
far too many personality factors contributing to successful job 
performance that ore not measured by presently available tests. 
An individual may have the capacity or ability for performing a 
particular job; however, if there are elements in his home life 
that keep him in a constant state of emotional turmoil, he is not 
likely to measure up to the level predicted by the test. For this 
and hundreds of other reasons like it, tests do something less 
thon a perfect job. 

Figure 2-1 illustrates this fact effectively. A test battery ^ 
consisting of three tests was set up for the purpose of identifying 
potentially successful naval electrical trainees. The scores on 
the three tests were combined into a single value which, in turn, 
was used to predict success in the electrical training program as 
measured by school grades. The prediction was about os accu- 
rate as can be expected with present tests, and yet tho results were 

1 tMWBHB, G. H„ Jk., and Thorntoh, Q. R. a tost buttery for idontifyuig 
potentially aucoessful naval electrieal trainoee. J. eppl, Pn/ohd., 1043, 27, 300- 
400. 


10 


PItOGEDVRES FOR CHOOSING TESTS 


11 

something less than perfect as the figure shows. Eacli bar repre- 
sents 20 per cent of the trainees^ with the group having the high- 
est combined tost scores at the top and the group having the 
lowest scores at Ihe bottom. The ^aded portion of each bar 
represents the percentage of the group that received school 
grades below the average of the whole group, whereas the solid 

B«$t 20% 
o 

|Naxl20% 
e N«Kt 20% 

h 

^ Next 20% 

Low 20% 

Per cent betow | Per cent above . 

class crade overage I doss grade overogo 

Fia. M.— Kaval electrical iralncca assigned to flvo groups on basis of com^ 
binod test scores to show proportions abovo and below avorago class grade. 

section indicates the percentage with grades above average. In 
spite of tlie fact that the prediction here is as good as is usually 
found, among the best 20 per <^nt on the test, 3 per cent fell 
below average in school performance. Tests sometimes indicate 
the selection of an individual who does not turn out well on the 
job; sometimes applicants who would have become good em- 
ployees are turned away. It must be borne in mind, however, 
that the some statements can also be made with respect to the 
interview or any other device that is used to select or allocate 
employees or applicants. The question becomes one, therefore, 
not of whether or not tests do a perfect job but of how much 
better than present methods they are. It must constantly be 
remembered, however, that almost never will a test or combi- 
nation of tests predict so accurately that every applicant who 
scores above a given point is sure to succeed, while every appli- 
cant who scores below that point is sure to fail. 

Standards Based on Opinion. — second result of misrepre- 
sentation has come about when certain experts have professed to 




PRINCIPLES OF PERSONNEL TESTING 


12 

be able to set up test batteries on the basis of subjective analyses 
alone. Such a proccdui'e, unless it is accompanied by accepted 
fact-finding techniques, is an extremely hazardous approach re- 
gardless of whether it is employed by an outside expert or by an 
employee of the company. Frequently, the adequacy or inade- 
quacy of the recommendations that are made' is never known. 



DIslanca acull)( both oyos 


Fia. 2-2 — ^Avcnigo mimbor of daxmia of pairs of hoso proorssod por hour by 
omployoca performing at various levels on a tlistnnco aciuly lest. {Vrom Tiffm 
and WitQ 

How well these who wore turned away might have done is a 
question that is rarely asked, much less answci'ed. Sometimes 
these guesses result in the selection of persons who arc actually 
poorest on the job. An example from the field of vision testing 
is a case in point. 

Looping is an operation in the making of hosiery and involves 
meticulous activity on the part of the operator *■ at about eight 
inches from the eyes. The relationship between the production 
records of 199 loopers and their scores on the standard Sn ell e n 
cliart vision test at twenty feet is shown in Pig. 2-2. As shown 
here, those employees with the best distance acuity scores 
actually looped fewer dozens of pairs of hose than did those who 

* Tiwih, JoBKPn, AND WiRT, S. E, Near vpraus distance visual aoiiity in roln- 
faon to aueooss on close indiislrial jobn. iSuppIomont to Tram. Amm. Acad. 
Ophihalt and Ototar^ JRoclicalor, Minn., Juncj H)44» 



PROCEDURES FOR CHOO&INO TESTS 18 

made a poorer showing on the Snellen chart. Discussion of this 
fact is postponed to Chap. VIII, but the example serves as an 
excellent illustration. To take the point of view that '‘since the 
job of looping requires ‘good’ eyes, we should select those appli- 
cants who can read die Snellen chart the best,” actually resulted 
in the best applicants being turned away. Testing programs 
based entirely upon guesses or estimates sooner or later fall by 
the wayside. The best technician, the personnel manager, or the 
director of industrial relations eventually is asked questions that 
he cannot answer unless he is fortified with the facts. Sometimes 
these questions are raised by top management, sometimes by 
supervision, and sometimes by the union. Facts are just as 
essential to the operation of a personnel testing program as they 
are to the operation of a life-insurance company. Without a 
knowledge of mortality rates at different ages, for example, an 
insurance company could not operate. Without the facts regard- 
ing the probabilities of improvements resulting from test usage, 
the testing program is doomed to failure. 

Two Techniques. — ^Thore arc two basic fact-finding techniques 
whereby it is possible to know whether or not a given test or 
combination of tests should bo used as a tool in allocating per- 
sonnel to a particular job. Both of these are known as validating 
techniques and have as their purpose testing the test. A given 
test may be excellent in connection with one job and virtually 
useless in connection with another job. Furthermore, job classi- 
fications tliat seem similar from plant to plant sometimes differ 
significantly; so it becomes essential to test the test in practically 
every new situation. Such validation procedures simply answer 
the question “Does this test aid in identifying those persons who 
are most apt to be successful on this particular job?” These 
methods ore referred to as the present employee method and the 
foUow-up method. 

PRESENT EMPLOYEE METHOD 

The present employee method is most frequently used and gets 
results most quioldy. It consists of five simple steps: 



PRINCIPLES OF PERSONNEL TESTING 


14 

1. Analyze tho job 

2. Select a trial battery 

3. Identify ciitcrion groups 

4. Administer the trial battery 

5. Compai'e test I'esults 

Analyzing the Job.— The particular job in question sliould be 
studied for the purpose of estimating the spocifio demands that 
it seems to place upon the employee. Is the employee required 
to read blueprints; must he be able to perforin the common 
arithinetio operations; does extreme finger dexterity seem to be 
required? The extent and formality of the job analysis will 
depend upon the test ledmieian’s familiarity with tho job. The 
objective, however, is to list the apparent abilities, skills, and 
attributes that are measurable with present test methods. 

Selecting the Trial Battery. — ^The next stop is to select a 
number of tests that arc intended to incnsuro tho traits or attri- 
butes reflected in the job analysis. Tho list of tests ivoscntcd in 
Appendix C should be suflioicntly inclusive for most purposes. 
However, tho Mental Meamreincnts Yearbook ‘ may also bo use- 
ful. The number of tests or tho number of hours of testing to be 
included in the trial battery will depend upon tho availability of 
the employees for testing as well as the relative importance of 
adequate placement on the job in question. The more important 
it is to have good employees on a particular job, the more time a 
company can afford to put into the development of an adequate 
test battery. In the case of bakery routemon, for example, 
poorly selected men not only result in lower sales but sometimes 
are the cause for losing a particular grocer as a customer, a con- 
dition that cannot easily be rectified. Tho general rule is that 
the more serious hiring mistakes con be, the more time tire com- 
pany can afford to spend in sotting up the test battery. Gener- 
ally spealdng, however, throe hours of well-chosen tests will give 
satisfactory results. 

^ Bukos, OaciUt K. (Ed.). The 1040 mental meamremenle yeaxhook. High- 
land Park, N J.: Mental Measurements Yeaibook, 1041, 074 pp. 



PROOEDURSS FOR CHOOSINO TESTS 16 

Identifying Criterion Groups. — ^The next step consists of 
selecting two groups from the ranks of the employees who are 
presently performing the job, one consisting of employees who 
are considered satisfactory in the pci'formance of tlie job and 
the other composed of employees who are falling shoi’t of the 
demands being made of them. Various measures of job success 
for classifying present employees are treated in Chap. Ill, The 
important thing is to select two criterion groups, one consisting 
of those considered satisfactory and the other consisting of those 
considered generally unsatisfactory. 

Administering the Trial Battery. — Once tliese two groups of 
employees have been identified, tliey sliould be called together in 
the conference room, the cafeteria, or some other suitable meeting 
place (see Appendix B) and asked to take the testa. They should 
not be told that some of them have been selected because they are 
performing tire job satisfactorily and that others have been chosen 
because they are not doing so well. They should be advised, how- 
ever, tliat the company in its efforts to find ways of placing the 
right people on tho ri^t jobs in the future is presently engaged 
in testing the tests. Key points well worth keeping in mind are ; 

1. No employee should be compelled to participate. 

2. Each employee should be given the opportunity to withdraw 
without embarrassment if he so desires. 

3. It should be made dear that no one’s status with tlie com- 
pany will be affected by his performance on the tests. 

In the event that the employees in the company are organized, 
certain precautions should be taken. Although it is manage- 
ment’s prerogative to decide whether or not it will use tests as a 
selection and placement tool, management does need the coopeiv 
ation of the union when it is in the process of validating its tests 
by the present employee method. For this reason, it is well to 
contact tlic shop committee ahead of time. Such a contact should 
not bo made in the spirit of “will you permit us to do this?” but 
rather in the spirit of informing the employees. Managements 
that disregard this suggestion frequently fail m their testing pro- 



15 PRINCIPLES OF PS^ONNEL TESTING 

grams. Without the proper information in tho possession of tho 
employees, trouble usually results. Where labor relations have 
previously been good mid where the story is honestly and ade- 
quately presented, cooperation usually follows. In those rare 
instances in which this is not tho case and the union chooses not to 
cooperate, there is usually notliing to bo gained by forcing tho 
issue. In these instances it usually pays to substitute tho follow- 
up method discussed later. 

Comparing Test Results. — Various metliods for analywng and 
presenting tlie results of this tryout procedure arc discussed in 
Chap. IV. However, the most simple and obvious approach will 
be discussed here. Suppose that having selected and tested two 
criterion groups, it appears that both groups average about the 
same on a given test. It can be concluded that this test has no 
value for purposes of identifying the potentially successful em- 
ployees on. this job. By this very simple procedure or one of the 
variations of it discussed in Qiap. IV the facts can bo collected. 
These facts are absolutely essential if a program fa to succeed. 

THE FOLLOW-UP METIIOD 

In reality, the follow-up method is only a modification of tho 
present employee method. As the name suggests it involves tho 
use of newly hired employees not presently on tlio job. It like- 
wise consists of five steps, only two of which differ from those in 
tlie above method. 

1. Analyze the job 

2. Select a trial battery 

3. Test all new hires 

4. Classify them into criterion groups 

5. Compare test results 

Testing New Hires. — Since tho first two steps arc identical in 
the two methods, no further discussion of them seems necessary. 
This procedure colls for the administration of the trial battery to 
all new people who are placed on the job. This icsting can be in- 
tegrated with the usual hiring procedure, and no comment need 
be made to the employee regarding the use of tests for the first 



PnOCEDVBBS FOB CHOOSING TESTS 17 

time. It is well, however, to follow the suggestions presented in 
Appendix B so as to ensure msximuni cooperation from the appli- 
cant. Most important of all is the fact that during the tryout 
period test scores should not be knovm by interviewers or others 
involved in the hiring procedure. Extreme care should be exer- 
cised to prevent the test scores from influencing those responsible 
for hiring. This is because it is not known at this time whether 
or not the test is correlated with job success. At no time should 
the person responsible for testing lose sight of the fact that this 
is a tryout period and that scores should not be used until the facts 
are known. Tests and scores should be filed for use as indicated 
below. 

Classifying the New Employees. — ^After a sufficient number 
of new employees have been tested and placed on the job and after 
sufficient time on the job has elapsed to permit judgments to be 
mode or records to have accumulated, these employees should be 
segregated into criterion groups in the same fashion as suggested 
in connection with the present employee method. One of the 
approaches outlined in Chap. Ill will usually suffice. The ques- 
tion of when an employee has been on the job long enough is 
sometimes a perplexing one. The time period will vaiy with the 
type of operation and the nature of the training program. In 
operations where individual production data are available it is 
sometimes wise to plot the average performance by days or weeks 
during the initial periods of employment. Figure 2-3 shows such 
a curve for five employees learning to inspect ophthalmic lenses. 
Even though these employees began at different times, their 
records were grouped for their first day on the job, their second 
day, etc., and hourly averages were computed. Although this par- 
ticular learning curve does not cover a sufficient time period to 
permit a valid judgment to be made as to whether or not the initial 
learning period has passed, nevertheless it is typical of such learn- 
ing curves in that it is characterised by a rapid initial rise plus a 
gradual flattening off. On this particular job, learning seems to 
be reasonably rapid. However, on many jobs where learning is 
slower the base line can be more advantageously plotted by weeks 
or even months. The point is that no attempt to categorize new 



18 PRWGIPISS OF PESSOffNEL TESTINO 

employees into criterion groups should be made until there is some 
evidence that the initial learning period has passed. 

Comparing Test Results.— Oiico those uoav hires arc classified 
into criterion groups, the procedure is then identical with that 
used in the present employee method. The scores on a given test 
are averaged for each criterion group, and the averages of the two 
groups are compared. Only if there oi'c significant differences in 



Fk>. 2-3— Average niimbor of ophthalmic lens inupootcd per hour by loamem 
during their lint few days on the job. 

the average scores of the two groups does the tost differentiate 
between good and poor employees. Attention is called to other 
methods for analyzing and comparing data presented in Chap. IV. 

COhlPARISOir OF METHODS 

There has been no intent to imply here that the use of either 
method of test validation presupposes the exclusion of the other 
method. Probably the ideal approach is to use both methods, 
checking the results, one against the other. Each method, how- 
ever, has its advantages and disadvantages. 

Advantages of Present Employee Method. — ^Perhaps the 
greatest advantage of the present employee method is the speed 
with which results can be obtained. If hiring is slow, it may take 
six or eight months before a sufficient number of new people have 



PROCEDURES FOR CHOOSING TESTS 19 

been hired to permit a comparison by the follow-up method. 
Kesults may be obtained in a matter of hours widi the present 
employee method. 

Advantages of Follow-up Method. — (1) The priinaiy advan- 
tage of the follow-up method lies in the fact that tlie amount 
of selling which needs to be done prior to inaugurating the pro- 
gram is greatly minimized. It is not nearly so necessary to get 
prior support from supervision, the union, or others. All of that 
will come after facts have been accumulated, in which case the 
facts speak for themselves. (2) The testing situation in which 
applicants are tested during the tryout period is essentially the 
same as that in which subsequent applicants will be tested. There 
can be no objections based upon the real or imaginary argument 
that applicants are different from employed personnel. Occa- 
sionally there are instances in which the present employee method 
cannot be used. If the operation is a new one and there are no 
present employees or if men have always been employed in the 
past and women are to be hired now, obviously the follow-up 
method must be used. 

Which to Choose.— The question of which method to use is 
one that must be answered in the liglit of the circumstances. If 
these circumstances permit a free choice of either, it is probably 
best to use both and to check the results, one against the other. 
Should the results of such a dud tryout differ, those resulting from 
the follow-up method should usually be accepted. 

SUMMARY 

Tests can contribute materially to better personnel selection 
and placement. The choosing of tests for a given purpose, how- 
ever, must be done systematically and scientifically. Without 
actual facts indicating how well a test or group of tests identifies 
the superior employees, no intelligent judgment can be made as to 
whether or not that particular test should be used. Two general 
techniques for testing the test are available ; the present employee 
method and the follow-up method. Eat* has its advantages, and 
the decision as to which should be used is one that must be made 
in the light of conditions existing in the particular plant or com- 
pany. 



CHAPTIOB III 


MEASURES OF JOB SUCCESS 


Identifying two groups of employees, one of which is known to 
be good on the job and another which is known to be poor on the 
job, is in many respects the most important step in the test- 
validation procedure. This is true regardless of whether the 
present employee method or the follow-up method is used. IIow 
well or how poorly this step is done will often determine how suc- 
cessful the chosen tests will be in the future. 

Criteria of Success Vary. — ^Tho question of who is most suc- 
cessful on the job is associated with iho nature of tho operation. 
For example, an operator diatged with tho rcsimusibility of finish- 
ing an aircraft propeller blade that has already had a thousand 
dollai's worth of labor put into it makes a very costly error if he 
damages the blade so that it cannot be salvaged. How many he 
finished in a given period of time is relatively unimportant, tlie 
best measure of success on his job being tho quality of his work as 
evaluated by the amount of damaged material. On tho other 
hand, the best criterion of success in the assembly of a certain typo 
of electrical fixtures would very likely bo quantity. In a com- 
petitive market on products involving a relatively high labor cost 
it is perhaps more important to select operators who produce a 
lot, even at the expense of errore. Here the measure of the suc- 
cessful operator is in terms of how many he produces or assembles. 
On some jobs the most important item is the reduction of personal 
accidents; so the best employee is the accident-free individual, 
and the poorest employee is the acoident-prone person. The 
terms “good” and “poor” applied to employees ai'o relative and 
are matters of definition which must bo mode in tho light of the 
critical factors inherent in the job itself. 

ao 


MEASURES OP JOB SUCCESS 


21 

Success a Matter of Policy. — frequently this definition of 
success is a policy matter, independent from the operation itself. 
The definition of a good salesman, for example, varies consider- 
ably from company to company. Is he the person who mokes the 
most calls, the one who secures the most new accounts, the one 
with the greatest volume, the one with the greatest volume on 
certain “long profit” items, or is he the one with the fewest cus- 
tomer complaints? 

for example, one life-insurance company might build its busi- 
ness strictly on annual volume with little regard for cancellations 
os long as the volume is high. The management of another com- 
pany might feel that volume alone does not teU <h.e whole story. 
Such a company might prefer to sell less to each individual, keep- 
ing the amount within his ability to carry it, and mako its quotas 
through sales to more individuals. It is clear that volume alone 
does not tell the complete story. “What is a good insurance sales- 
man” can very well differ from company to company. 

Availability of Facts. — ^Not the smallest consideration in 
selecting a criterion is the availability of facts. Entirely too often 
for comfort, the personnel test technician finds either that in- 
dividual records on employees have not been kept or, if they have 
been kept, that they are not filed in a way which makes them 
available with a reasonable amount of clerical work. One service 
that can nearly always be performed by the personnel man is the 
encouragement of management and supervision to keep adequate 
records. Who is producing the most and who is making the mis- 
takes is a question that can be answered all too infrequently. 

Four Classes of Criteria.— Little can be said here regarding 
what standards or policies should be considered in setting up the 
two groups of employees for a specific job. The purpose of this 
chapter is to present a reasonably comprehensive list from which 
can be drawn measures that are most appropriate for a specific 
job or operation in a specific company. In general, there are^ four 
broad areas: (1) production data, (2) personnel data, (3) judg- 
ments of others, and (4) job samples. 



22 


PRINC1PIB8 OF PERSONNEL TESTING 


PRODUCTION DATA 

Production data, as the name implies, are facts of a quantita- 
tive, numerical sort tliat are already available or tliat can be made 
available in the future. As a measure of job success they are use- 
ful only where people work as individuals rather than in teams 
or on lines. 

Qmntity . — ^IIow many pieces or parts per hour, day, or week? 

Time . — ^How long to do a particular job (usually applicable to 
service types of operations)? 

Quality . — ^How many rejects or reworks? How many dollars 
worth of scrap? How many en’ors? 

Earnings . — How many cents per hour (where piece rates or 
bonuses are in effect) did the employee earn? Did he earn a 
bonus? 

These are perhaps the obvious ones. Others will suggest tliem- 
selves as specific jobs ore studied. 

Sales Criteria . — ^Although the general problem of validating 
teats for salesmen is essentially the some as any other validation 
problem, the matter of sales criteria is by nature unique. 
Ohmann ^ has presented a coinproliensivo list from which tlie fol- 
lowing ones have been adapted : 

Sales volume fqr a given period of time 
Average number of calls per day 
Net commission eej'nings 
Average number of soles per month 
Average size of order 

Average number of new accounts per month 

Average sales volume pei* year for period of time employed 

Sales volume for first six months on the job 

Trend of sales volume over period of years 

Amount of allowances to customers 

Amount of returned merchandise 

Classes of trade called on 

Classes of products sold 

1 OuHAHN, 0. A. A report of reaearoh on the solootion of MileBmen at (ho 
Ttoioeo Manufacturing Compiuiy. I, oppf. PtyehoL, 1041, S5, 18-20. 



MEAgUnBS OF JOB SVCCE^ 28 

Ohmann contends that in his own work, net commission earnings 
have been tlie best criterion, although there is no implication that 
any single criterion should be used at the exclusion of the others. 

PERSONNEL DATA 

Nonproduction Measures. — Separate and somewhat apart 
from how mudi or how well an employee produces are certain 
facts about him that deserve consideration in the identification of 
superior employees. For example, whether or not an employee 
misses work often is a factor to be considered, even though he is a 
good producer while he is on the job. Other things being equal, 
the employee who is ^ways there is a better employee. These 
nonproduction facts usually involve the use of personnel data, 
some examples of which follow. 

Absenteeism . — Who are the best attenders, and who are the 
poorest attenders? Fox and Scott ‘ have reported that the num- 
ber of absences per unit of time is a better measure than number 
of days absence per period of time, any continuous period off the 
job being considered an absence whether it is one day or two 
weeks. 

Length of Service . — ^This is useful only when the follow-up 
method is used when employees can be classified into those who 
either terminated or were terminated within a six months’ period, 
for example, and those who remained on the job six months or 
longer. 

Bate of Advancement . — How long was the individual on tiie 
job before he was promoted to abetter one? 

Traming Time . — How long did it take the individual to learn 
the job? Sometimes employees can be grouped into those who 
attained a given production levd in a certain period of time and 
those who did not. (Sometimes this is classified under production 
data.) 

Accidents , — ^Employees may be divided into those who had one 
or more accidents during a given period of time and those who had 

^ Vox, John D., and Sonr, Jbiiomb F. Absenteeism: manaeemmt’s problem. 
Bunneas RGseorob Study No. 29, Boston: Harvard University, Sohool of Business 
Administration, 1013. 



24 PBINCIPLES OF PERSONNEL TESTING 

none during the same period. Lost time, home cases, hospital 
visits, or otlier indexes of severity may be utilised. 

All of these measures likewise employ quantitative facts that 
are presumably available and are indicative of job success. To 
this list may be added othci'S that ore more or less specific for cer- 
tain jobs; for example, in most service types of jobs, customer 
complaints are important. TIio list supplied above is general for 
most industrial jobs and will servo to suggest others. 

JUDGMENT OF OTHERS 

Frequency of Use. — Tlie individual engaged in test validation 
will use supervisory judgments in the form of ratings, ns a general 
rule, more frequently Ihoir he will use all of the other suggestions 
in this chapter combined. This is true because more times than 
not the job will bo of such a diaracter that individual records can- 
not be kept or, in those instances where records can bo kept, they 
will not be available. Therefore, it is Important to considei* some 
of the common pitfalls of employee-rating methods and to 
examine means of eliminating or at least minimizing weaknesses 
that tend to reduce the reliability and the validity of the ratings. 

Weaknesses of Scales. — ^Most merit-rating systcras presently 
in use in industry, though better than nothing, arc not sufficiently 
valid or reliable for uso as criteria in testing the test. Borne of the 
common weaknesses follow: 

1. They do not take into account or correct for the individual 
differences among ratcra. 

2. When several items are included, the "halo*’ effect is so great 
that the scales actually yield much less than they appear to. 

3. No effort is made to compensate for job differences. 

Differences between Katers. — ^The English language is not 
sufficiently exact for a given employee description to mean the 
same to everyone who reads it. In addition, the rater’s own tem- 
perament is reflected in any mental standards that he may set. 
And so when two or more raters are asked whether an employee’s 
work habits are poor, average, or excellent, different answers are 
obtained, even when the raters have had on equal opportunity to 



MEASURES OP JOB SOCCER 


26 

observe the man’s work. “Excellent” means different things to 
different people, and supervisors are no exception. 

Figures 3-1 to 3-4 are hypothetical examples to illusti’ate dif- 


• • 

• • • 

• • • 


T — I — r- r — ■ I — I — I — I 
Poor Av«. Good 

6 6 (0 12 14 16 18 20 

FiO, S-1^ — ^Distribution of ratings of fifteen men by Supervisor A| a^'emging 10. 


• # 


r- 

Poor 


1 I I ”r 

Avt. 

6 e ro 12 14 


I" T 


Qood 


16 IB 20 

Fiq. 3-2^Distribution of ratings of fifteen men by Supervisor averaging 17. 


Poor 


I I ' “r 

Ave. 

10 12 14 


Good 

20 


Fio. 


6 8 10 12 14 16 16 

-Distribution of ratings of fifteen men by Supervisor C, averaging 13. 


« « * 

• « • » 


I r r i — I — I— I — ^ 

Poor AM. Good 

6 8 to 12 14 16 IB 20 

Fia 3-4. — ^Distribution of ratings of fifteen men by Supervisor D^ averaging 13. 

ferent types of raters rating the same fifteen men. Eating values 
on this particular scale item range from 6 to 20 in two-point steps. 

Note that Supervisor A (Fig. 3-1), who might be thought of as 
the “hard-boiled” variety, rated most of the fifteen at 8, 10, or 12 
and that his ratings averaged 10. 

Supervisor B, the “easy-going” variety (Fig. 3-2) , rated most of 
his at 16 or 18 with an average for the fifteen ratings of 17. 




26 PRINCIPLES OF PERSONNEL TESTING 

Supervisor C (^Fig. 3-3), the “Caspar Milquetoast” variety, be- 
ing afraid to make decisions, rated all but three men at 12 and 14 
and had an over-all avere^e of 13. 

Supervisor D (Fig. 3-4), using the scale to the best advantage, 
spread the men along the entire range of tho scale but also had an 
average of 13. 

Table I gives a summary of the four distributions and shows at 
a glance the fact that tliis scale can produce vastly different 
results in the hands of various supervisors. A rating of 10, for 

TABLE I.— SDMMABY OF EATINGS OF FOUR SUPEBVKORS 
FOR FIFTEEN MEN 


Supervisor 

lliitings 

Highest 

Average 

Lowest 

A 

10 

10 

0 

B 

20 

17 

12 

C 

16 

13 

10 

D 

20 

13 

0 


example, means average when given by A and very poor when 
given by C. A rating of 16 by A or C means very good but means 
below average for B, It can readily bo seen that any pooling of 
ratings by different judges can mean chaos unless somo approach 
is utilized to ensure greater uniformity. The training of super- 
visors will help somo, but these differences which are a function 
of temperament as much os interpretation will always remain 
when not controlled. 

While these examples ore hypothetical for purposes of illustra- 
tions, they ore not exaggerations and are truly indicative of what 
is found in industry. Figure 3-5 shows tlic actual distribution of 
ratings for three different departments in a steel mill. An eleven- 
item merit-rating scale was used; points were awarded on each 
item; and total points were converted to A, B, C, or D ratings 
for each employee. Note the discrepancy in tho proportion of 
ratings from department to department. Noto that an employee 
in the annealing and pickling department has eight chances in a 
hundred of receiving a D while no one in the other two received 



MEASURES OF JOB SUCCESS 27 

D’fl. His chances of receiving either A or B are 78 out of a 100 in 
the transportation department and 45 out of a 100 in the anneal- 
ing aaid pickling department. Although the ability of employees 
to do their jobs can vary from department to department, it is not 
reasonable that eight out of a hundred in one department are bad 
and that there are no bad employees in another. Ratings fre- 


63% 



D C B A 
Transportation dept 
(N»4!) 



0 C B A 
Annealing and 
pickling dept 
(N=67) 


56% 



Maintenance 

(N»9) 


Fra. 3-6. — Distributions of mont-raiing scores in three rlilTcrent departments 
of a steel null showing tlio pcrconlage of men receiving K, B, C, and D rat- 
ings. 


quently tell more about the person doing the ratings than about 
tile employees being rated. 

Halo Effect. — When persons who prepare merit-rating sys- 
tems include long lists of items such aa initiative, cooperation, 
ingenuity, and so on, the implication is that the person doing the 
rating can look at an employee one way and see only '‘initiative,” 
that he can look at him another way and see only “cooperation,” 
and so on. Such is contrary to the fact as is demonstrated by 
reseai’ch studies.*- In reality, the supervisor thinks of an indi- 
vidual largely in terms of how well he believes he is measuring up 
to what is expected of him. In other words, job performance is 
what the supervisor sees, and he sees it when he is looking for 
initiative, when he is looking for cooperation, and when he is look- 
ing for ingenuity. It is not strange, then, that when a supervisor 
attempts to rate his employees on these complicated rating scales, 

1 KwAnT, Edwin, Srabhokg, S. E., and Tifpin, Josepk. A factor analysis of 
an industrial merit rating soalo. J. appl. Psychol., 1941, 25, 481-486. 



28 PRINCIPLES OF PERSONNEL TESTING 

he tends to place a given employee nt about the same position on 
each scale, regardless of what that particulai* scale is called or 
what it was intended to measure. Figure 3-6 shows the actual 
ratings of fifteen men drawn at random from somo 0,000 em- 
ployees in anotlier steel mill. The rating scale * used employed 
eleven items, and tlic figure sliows the relationship between tlie 



Fio. S-Sr-Scaltori'rania flliowinR rclalionflhip holwcen protiuctivlty anri jiulf;- 
mont mtingt) ond produclivily and ppraonnltly mtingfi for ilflron inon in n alool 
mill. 

"productivity” item and the "judgment” and “personality” items. 
Note the reasonably close agreement. An individual who was 
rated at a given point on the pi-oductivity scale tended to bo rated 
at about the same point on each of tlie other two. This is known 
as the "halo” effect and suggests that the many different items oi'e, 
in reality, measuring about the same thing and that some greatly 
simplified system might do ns well or even better. 

Job Differences. — ^In theory, at least, it sliould bo just as 
nearly possible to have a janitor who is doing a 100 per cent job 
as it is to have a tool- and diemaker who is doing a 100 per cent 
job. Figure 3-7 shows five jobs selected at random from Tiffin’s 
longer list." Actually, tlie average number of merit-rating points 
in each job classification decreases as the skill level of the job de- 
creases. This suggests that these employees have not realty been 

1 Tinrm, JoeRPH, and Mubsbh, Wayhr. Weighting merit rating items. /. 
apfi, Psychol^ 1042, 26, 676-583. 

* TirriN, JoSM’ii. Industrial Psyehology. Now York: Prontice-llall, 1017, 
p.834. 


MJSMORES OP JOB SUCCE^ 29 

measured against the demands of their respective jobs but that 
their general competence as individuals has been evaluated with- 
out reference to job classification. 

Suggested Plans. — Best results have been obtained by using 
one or both of two very simple plans, either of which will over- 
come most of the weaknesses cited in connection with usual mcrit- 


rurnaea oparator 
MMnmtii 
Machinist helper 
Oong laborer 
ilqnltor 




L r 









l 




i i I 

£60 300 360 400 


Avoroge point rating 

yiQ. 3-7.— Avorngc iotftl point nttin^ tor men in fivo job ciniBificniionB in n 
steel null. (Prom Tiffin.) 


rating plans. One is called the card’-stacking method, and the 
other is called the ranking method. Fundamentally they are the 
same and differ only in certain minor respects. 

The Card-stacking Method. — After the supervisor who is to 
do the rating has been led to see the need and value of rating 
people on the job for purposes of test validation, he is supplied 
with a stack of cards similar to the one shown in Fig. 3-8, each 
of whidi bears the name of one employee under his supervision. 
Next, he is asked to make three rough groupings of these cards, 
placing in one stack his superior employees, in one his below- 
average employees, and in the other his average' people. If he 
argues that he has no above-average employees, for example, he 
is reminded that “some employees ore better than others” and 
that he should put the best he has in one stack. After this is done, 
reference is made to Table II and he is asked to make a second 
grouping. For example, if he has placed more than 30 per cent in 
the best group, he is asked to identify the poorest so-many and to 
move them to the middle group. In other words, his first group- 
ing is corrected so that ho has 30, 40, and 30 per cent in the re^ 


80 


PRINCIPLES OP PERSONNEL TESTINO 


TABIjE n.~F01l DETERMINING NUMBER OF EMPLOYEES 
TO BE PLACED IN BACH CATEGORY WHEN USING 
TUB CABD-STAOiaNQ METHOD FOR RATING 


SoGOnd grouping 

Third grouping 

No. 

rfttod 

Best 

80% 

Mkldlo 

40% 

Poorest 

30% 

No. 

rated 

Boat 

10% 

Noxt 

20% 

Middle 

40% 

Noxt 

20% 

Poorest 

10% 

10 


4 


o 

1 

2 

4 


1 

11 


6 


IB 

1 

2 

6 


1 

12 


4 


11 

1 

3 

4 

3 

1 

13 


6 


H 

1 

3 

6 


1 

14 




14 

1 

3 



1 

16 

6 

6 

6 

16 

2 

3 

6 


2 

16 

6 

■■ 

5 

16 

2 

3 

■a 


2 

17 

6 


6 

17 

2 

■■ 

■I 


2 

18 

MM 

■9 

■■ 

18 

2 


HI 


2 

10 

■1 

■1 

■■ 

19 

2 

mm 

91 


2 

20 

H 

8 

6 

20 

2 

mM 

8 


2 

21 

mm 


■a 

21 

2 

WM 

0 


2 

22 

6 

10 

mm 

22 

2 

Mm 

10 

4 

2 

23 

MM 

0 


23 

2 

6 


5 

2 

24 


10 


24 

2 

6 

10 

0 

2 

25 

^9 

11 

^9 

25 

2 

6 

11 

6 

2 

20 

8 

10 

8 

20 

3 

6 

10 

6 

3 

27 

8 

11 

8 

27 

3 

6 

11 

5 

3 

28 


10 

MM 

28 

8 


10 


3 

20 

9 

11 

u 

20 

8 

e 

11 

0 

8 

80 


12 

^9 

n 

3 


12 


3 

81 

9 

13 

9 

81 

3 

0 

13 

0 

8 

32 

0 

14 

HB 

32 

3 

■9 

14 

■9 

3 

38 

10 

13 

10 

33 

3 


13 


8 

84 

10 

14 

10 

34 

3 


14 


8 

86 

11 

13 

11 

36 

■■ 

wm 

18 


4 

36 

11 

14 

11 

36 


^9 

14 

■9 

4 

37 

11 

16 

11 

87 

■■ 


16 

^9 

4 

88 

12 

14 

12 

88 

4 

8 

14 

8 

4 

89 

12 

15 

12 

30 

4 

8 

16 

8 

5 


Bpeotive Btocks. Finally, again using Tablo II, he is asked to 
make a third grouping so that he ultimately has five stacks of 
cards containing 10, 20, 40, 20, and 10 per cent of the total group, 
respectively. He then mokes an x in the appropriate space at the 














































MEA8UBBS OP JOB SVCCEBS 


31 


HOW TO OBTAIN SUPERVISORY RATINGS FOR USE AS TEST 

CRITERIA 

STEP 1 , — Get supervisor to see need for rating 

Use group conference or individual interview 
Explain what he has to gain 
Keep approach infonnal 
Explain confidential nature 
Emphasize present job 

STEP 2 , — Provide him with typed cards 
Three by five inches 
One name to a card 

STEP 3 . — Have him make three rough groupings 
“Some employees ai’e better than others” 

One pile “best” 

One pile “poorest” 

One pile “in between” or “average” 

Throw out any “don’t knows” 

STEP A, — Correct distribution to 30, 40, 30 per cent (second 
grouping) 

Use informal approach 
“Who is poorest in this group?” 

“Who M best in this group?” 

STEP 5 . — Have him extend the distribution to 10, 20, 40, 20, 10 
per cent (third grouping) 

Have him identify the extreme employees 
“Who are the best so many in the high group?” 
“Who are the poorest so many in the low group ?" 


STEP 6 . — Record ratings on cards 
Use A,B,C,D,Eor 
Use 6, 4, 3, 2,1 



PRINCIPLES OP PERSONNEL TESTING 


82 

bottom of the card and dates and initials or signs it. Letter ratings 
of A, B, C, D, and E or numerical ratings from 1 to d can later be 
assigned. The system works well when the supervisor has ten or 
more employees. A simplified step-by-step procedure is pre- 
sented on page 31. 

The Ranking Method. — ^Thc ranking method is quite similar 
except tliat it is better adapted to small groups of fifteen or less. 


Noma - ■ Dopt - - 

Lost First 

Job Performonco 

How satisfactory Is this employee in the performonce of his 
present Job? 

NOTE* Consider such items as general productivity, 
Including quality ond quantity of work, specific 
Job knowledge, safely, industry, dependability^ 
cooperation, and persevorance. 


PooroBt 

Next 

Mtd(U« 

Next 

Dest 

10% 

2C% 

40% 

20% 

I07o 

□ 

Date 

□ 

□ 

□ 

Rated by 

□ 


Fia. 3-8^-^amplQ form {or ubo in rating omploycca by tlio card-Eiaohino 
method. 

The supervisor is likewise handed a pack of cards, each one bearing 
an employee's name (see Fig. 3-9), In this instance, since the 
number is small, he is simply asked to rank the caids, placing the 
one bearing tire best employee’s name on the top. Sinco extremes 
are more easily identified, it is best to ask him to select tho best 
and poorest first. Then, among those remaining, select the best 
and the poorest and so on. Entries are mode on the cord to in- 
dicate the number rated and the employee's rank number. Later, 
by means of Table III, a value from 1 to 5 may bo assigned to 
make the ratings roughly compoi'able to those obtained by the 
card-stacking system. 

Forcing the Distribution. — Both of these systems are intended 
to “force the distribution," that is, to force the rater to use the 





MEASURES OF JOB SUCCESS 


33 


Name Deot 

Lost First 

Job Performance 

How satisfactory is this employee in the performance of his 
present Job ? 

NOTE Consider such items as general productivity! 
Including quality and quantrly of work, specific, 
lob knowledge, safety, induslry, dependobilily, 
cooperation, perseverance 

Number of employees ranked — 

This employee's rank (No t is high) 

Leave blank 

Date Ranked by 


Fio. 3-O^^mplc form for use iu rating employees by Iho ranking method, 

TABLE ni.-rOR CONVERTING EMPLOYEE RANICS TO A 
PIVE^POINT SCALE 



whole range of the scale instead of some particular part. Without 
having to interpret such words as “excellent” and “poor,” the 
supervisor identifies the best 10 per cent now in his department, 




































84 PBINOIPLES OF PERSONNEL TESTING 

the poorest 10 per cent, eto. The spreading of the distribution is 
important because if all employees are rated the same or nearly 
the same, there is no differentiation, and the ratings m’o useless for 
test-validation purposes. 

Pooling Judgments. — ^The general rule is that pooled judg- 
ments are superior to individual judgments, that the average rat- 
ing of two judges is superior to the rating of cithei' alone. This 
is true, however, only when both jtidges are egtmlly qualified to 
evedmte the employee. So often in industry to secure an addi- 
tional judge it is necessary to move to tlie next higher level of 
supervision, in which case it is almost invariably true that the 
additional judge knows less about the employee’s day-to-day 
competence. There are instances, however, where several super- 
visors are equally qualified to judge. For example, if employees 
rotate from shift to shift and supervisors do not, each supervisor 
has presumably equal opportunity to observe. Considerable 
judgment must be exercised by the personnel man in deciding 
what ratings to use. It is never wise to overlook completely the 
general competence of the supervisor in deciding whetlier or not 
to use his ratings. 


JOB SAMPLES 

Need for Job Samples. — Occasionally there oro job dassifica- 
tions for which neither production nor personnel data ore avail- 
able and for which judgments by supervisors or others are not 
dependable. Many inspection types of jobs fall in this particular 
category. Since, as a general rule, the inspector is the last em- 
ployee to handle the product, it goes on directly to the customer. 
If the inspector makes a mistake, the only way in which it is 
known is through a customer complaint. Very often the com- 
plaint cannot be traced to the person passing the defective or 
faulty item. In this kind of situation, supervisors have little or 
nothing to go on. They do not possess information that they can 
use in intelligent rating. Tiffin and Rogers' assorting room study * 

* Tiffin, Joseph, and Roobub, H. B. The selootion and tnuning of Jnspootore. 
Pmonnel, 1041, 18, 14-Sl. 



MEASURES OF JOB SUCCESS 35 

is a case in point. In seeking a criterion against which to validate 
test scores they secured supervisory ratings. These ratings 
showed virtually no relationship with the tests used. Later, the 
operators who inspected tin plate for defects and sorted it were 
each asked to sort a standard stack, each sheet of which had been 
previously coded as “satisfactory” or “defective.” Each girl was 
timed, and each was given a percentage accuracy score. The 
average time and accuracy scores for the twelve girls rated best 
by the supervisors were compared with those of thirty-eight girls 


High raM gfrU 
Randomty sotactad girls 


High rated girls 
Rgndomly selected girls 




203 


} 


Minutes 

to 

sort 




|73.7%"v 

|77l%-^' 


Accuracy 

percenloge 


Fia. S-lOr— PerformancQ of high rated sorters and randomly selected sorters on 
tiii-*platc job samples. 


selected at random, and the results are presented in Fig. 3-10. 
Although tire girls rated superior were slightly faster, requiring 
an average of 15.1 minutes as compared with 20.3 for the ran- 
domly selected girls, their accuracy was slightly lower. In other 
words, in the absence of other facts upon which to base judgment, 
the supervisors had evidently selected as superior those girls who 
handled tlie most metal in a day’s time. 

Nature of Job Sample. — job sample is a portion of the job 
standardized in such a fashion that everyone performs the identi- 
cal task. In the case cited above the same 150 sheets of tin plate 
sorted by each operator constitute a job sample. Another job 
sample reported by Lawshe and Tiffin ^ involved the performance 
of 200 inspectors with those instruments in a list of twenty that 
they were called upon to use in their jobs. Figure 3-11 sho\ra the 

^ Lawbhd, C. H., Jn., AND Tiffin, Josbpk. The aoouriiey of pieoision iostru- 
mont inapootion. J. appf. FsvcAoI., 1846, 20, 413-410. 


36 PBINGIPLES OF PERSONNEL TESTING 

percentage meeting the standard on eleven of the twenty job 
samples. Here objective measures of an employee’s ability to do 


mlcromilir 
±0001 N*I6S 

2* Vkrnltr 
mtofomafer 
±0001 N>l3e 

6* Varniar 
mleramaiar 
±0001 N>I3I 

S'CRagular) 
mlcromatar 
±001 N>I46 


micramalar 
±O0) NM42 


Xmmmmmmx 


JOOl NMI7 


Inilde collpar and 
2r mlcromattr 
±JD0I NM27 

Inside caliper and 
6* micrometer 
±002 N>II2 






Outilde caliper and 
6'rula 

± 1/64 N>II7 
Vernier caliper 








TiOa S^ll^The percentage of inspectors passing and failing various prooision- 
meaBuring instrument porformance tests in an aircraft propeller plant, The solid 
bars indicate the percentage passiug, {From Lamho and Tiftn,) 

his job or at least one aspect of it, while obtained for another 
purpose, could easily be utilized for segregating these employees 
into two groups for purposes of test validation. Whenever a po]> 






MSAS0RE8 OR JOB SUCCESS 37 , 

tion of the job can be pulled out and set up as a standardized task 
that everyone can be asked to perform, it can be thought of as a 
job sample. 


THE SINGLE VARIABLE 

Learning Time. — On page 17 in Chap. II, reference was made 
to the need for controlling tlie experience or learning factor in 
selecting criterion groups. If some of the employees have not 
been on the job long enough to learn it, they will probably have 



Fia« 8-12v-*Curvo ffhowing trend of average merit-rating points for 1|000 steel 
mill employees by amount of Bcrvioe on the job. 

low production or be rated low by supervisors. This, of course, 
places them in the poor group, when in reality there is no way of 
determining at that time whether they will actually be poor or 
good once they have had the opportunity to learn the job. As 
pointed out earlier, it is sometimes desirable to construct learning 
curves similar to I^g. 2>3 in order to get an idea of the length of 
the learning period. 

The same phenomenon sometimes appears in connection with 
merit ratings. Figure 3-12 shows the average merit-rating points 
for one thousand employees in a steel mill by months with the 
company. It is quite evident that the tendency is for new em- 
ployees to be rated lower than old employees. Whether the older 
employees ore really better or supervisors only feel that they are 
better cannot be decided from these facts. However, the point 
is that in setting up criterion groups based upon ratings, it will 
quite likely happen that the good group would tend to have a 



38 PRINCIPLES OF PERSONNEL TESTING 

preponderance of older employees, whereas the newer group would 
tend to have a preponderimce of newer employees. If this is tlie 
case, it is usually desirable to eliminate from the study those em- 
ployees having extreme amounts of experience. In the case cited, 
one would be reasonably safe in eliminating all employees with 
fewer tlian twelve months’ experience or more tlian twenty-four. 

TABLE IV.— MEAN TEST SCXDIIES OP PllEIGIIT SOLICITORS 
DIVIDED INTO RATING GROUPS 


Rating 

All Bolioitors 

After olinitnaiion 

N 

M(Min 

N 

Mean 

A 

21 

“168 

16 

-168 

B 

22 

-14S 

17 

-.134 

C 

12 

-163 

6 

-125 


If he did this, he could be certain that such differences as he might 
find in test scores between good and poor employees were not 
assooiated in any way with experience on the job. Likewise if he 
found no differences, he could be certain that true differences 
were not being mnslcod by the experience factor. 

Age and Other Factors. — In addition to experience, such 
factors as age, sex, and color should be controlled by eliminating 
a sufficient number of cases to moke the criterion groups com- 
parable. A case involving railway freight solicitors shows how a 
number of variables can mask the true facts. Forty freight solici- 
tors who had been rated os A, B, and C were given the Bem- 
reuter’s Peraondlity Inventory and their mean scores on the 
“neurotic tendency” (Bl-N) component are shown on the left 
side of Table IV. Systematic age and experience factors were ap- 
parent, however; and when all solicitors with less than two year’s 
experience and all who were over fifty years of oge were elimi- 
nated, thirty-seven remained, and their mean scores are shown on 
the right side of Table IV. Note that although no important 
differences are apparent when all ore considered, there is a sys- 
tematic trend to the averages when the select group is used. 



MEASURES OF JOB SUCCESS 


89 


SUMMARY 

Selecting groups of employees that ore relatively good and poor 
is one of the most important steps in the test- validation pro- 
cedure. Generally speaking, the nature of the criterion of job 
success varies with the nature of the operation, the policies of the 
management, and the availability of facts. The four types of 
criterion data are production data, personnel data, judgment of 
others, and job sample performance. 



CHAPTESR IV 


METHODS FOR ANALYZING AND 
PRESENTING FACTS 


Throughout Chap. 11 the importance of “testing the lost** is 
emphasized, and in the discussion of the five-step procedure 
reference was made to the comparison of the test results of the 
two criterion groups of employees. There are four basic ap- 
proaches to the study of relationships between tost scores and 
measures of job success: the scattergram, the method of averages, 
the method of percentages, and the profile method. Each of these 
has variations that may bo adapted to the needs of particular 
problems. In order to illustrate the application of these several 
techniques, data collected in conjunction with the trade training 
program^ for electricians mentioned earlier were used. Two 
hundred trainees were administered nine different tests prior to 
their admission to a fifteen weeks’ training program. At the dose 
of the program, each received on achievement grade somewhere 
between 2.0 and 4.0, the latter being the maximum that could be 
obtained. Test scores and grades in the training program are 
used here to illustrate the major methods of anaJysiB and inter-* 
pretation. 


THE SCATIERORAM 

Two Variables. — Any time that two variables (that is, test 
scores and some measure of job success) arc being studied, some 
estimate of their relationship can be made by preparing a graphic 
picture of this relationship, known as a scattergram. As is shown 
in Fig. 4-1, the vertical axis (called the ordinate) is used to 

* IiAwaiin, 0. n., Jr , ano T]{orntqh, fl. R. A tent battery for idontifying 

potentially Booccnful naval cleotrical trainees, J. appi, PiychoU, 1043 , 87 , 300 -> 
406 . 


40 



42 PRINCIPLES OP PERSONNEL TESTING 

describe one variablcj in this case scores on n test. The horizontal 
ajcis (called the abscissa) is used to represent the other variable, 
in this instance grades in a training program. Each individual 
is then represented by one dot, the position of the dot being located 
by that person’s tost score on the vertical axis and by his grade 
in the training program on the horizontal axis. When all of the 
individuals in the study are plotted in tliis fnsliion, the resulting 
scattergram indicates the degree of relationsliip between the two. 

Examples. — The two examples in Fig. 4-1 represent the same 
group of individuals. They show the relationship of grades in 
tlie electrician’s training program and each of two tests taken 
before training began. 'The scattergram on the left in Fig. 4-1 
shows the relationship between spelling-test scores and grades 
made later in the training program. In contrast, the scattergram 
at the right allows the relationship between scores on the Purdue 
Industrial Training ClassifuMtion Test and gi'adcs in the training 
program. It is evident that there is a closer relationship in the 
case of the latter, since the scatter of dots more nearly approaches 
a straight line. In tlie former the pattern more nearly approaches 
a random scatter or a circle. 

Correlation. — ^These scattergrams aro useful when inspected 
visually. As in the case of the examples in Fig. 4-1, when one test 
is highly related to the criterion and when the other bears little 
or no rdationship, it is relatively simple to look at the patterns 
and determine which of the two tests would be most useful for 
future selection. Sometimes, however, the differences ore not so 
apparent as in the present examples. Statisticians have devised 
ways of expressing the degree of relationship in terms of a single 
number called the confident of correlation^ Although the treat- 
ment of this concept is beyond the scope of this book, a few words 
seem in order. The coefficient of correlation describes the degree 
of relationship between two sets of values. A perfect positive 
relationship is described by a coefficient of 4-l>00, and a perfect 
negative relationship by a coefficient of — 1,00. In terms of the 
examples presented in Fig. 4-1, if a perfect positive correlation 

1 Sco itny elemonlary textbook in statmlics such os LiNnqviST, K F, A first 
course m statisUes, Boston; Houghton Mifllin, 1043. Pp. 63^204. 



METHODS FOB ANALYZING AND PBE8ENTING FACTS 43 

(+1.00) were to exist between scores on the Purdue IndustriaL 
Training Classification Test and grades, the dots would all fall 
into a straight line and it would be possible to predict exactly the 
school grade from the test grade in advance. That is, the trainee 
having the highest score on the test would also have the highest 
grade, and the trainee making the lowest score on the test would 
have the lowest grade, and so on. If, on the other hand, there were 
a perfect negative rdationship ( — 1.00), it would be possible to 
predict the grade just as accurately from the test score but the 
person with the highest test score would have the lowest grade 
and the person with the lowest test score would have the highest 
grade and so on. A coefficient of 0.00, indicating no relationship 
at all, would indicate a random scatter of dots and that an indi- 
vidual having a high test score would be no more likely to have a 
high grade than would one with a low test score and vice versa. 
Two variables having a degree of relationship anywhere between 
these two extremes of no relationship and perfect relationship 
would yield a coefficient of correlation somewhere between 0.00 
and 1.00. In the present examples, the correlation between 
the spelling teat and grades was computed to be 0.26, whereas for 
the Purdue Industrial Training Classification Test it was found 
to be 0.71. It should also be added that when the coefficient of 
correlation is known, the value of one variable may be estimated 
from the other; but as the correlation approaches 1.00, the sise of 
the error will diminish until, if there were a correlation of I.OO, 
there would be no error. As the corrdation approaches 0.00, the 
error of estimate or prediction increases until, when the correla- 
tion is exactly 0.00, the error is as great as it would be if the value 
of the second variable would always be estimated at the average of 
the distribution. 

Usefulness of Correlation. — The coefficient of correlation is 
extremely useful in the hands of a skilled technician who is ex- 
perienced in its use. Its value is limited largely to the determinor 
tion of the degree of relationship. It is extremely difficult to 
present and interpret to managerial, supervisory, and union per- 
sonnel not experienced in the use of statistical concepts. Although 
the soattergram is useful in presenting and interpreting facts about 



PRINCIPLES OF PERSONNEL TESTING 


44 

teats, the person in charge of the program will do well to avoid 
reference to the coefficient of correlation when making presenta- 
tions, either verbally or in writing. One or more of the following 
approaches is ordinarily more useful. 

THE METHon OF AVERAGES 

Simple Averages.— As suggested in Chap. II, one of the most 
understandable and consequently more useful methods for com- 
paring two criterion groups is the method of simple averages. The 



Low High Low Hfgh 

half half half holf 

SPELLING TEST ITC TEST 


Fin. i-Sj—Avcnxf'o courao srarlon for olootriciil trainocs whan divided into high 
and low halves on two tosta* 

mean (commonly called the average or arithmetic average) is 
used. In Fig. 4-2 tlie same data used above were employed. (1) 
The 200 men were classified into two groups, the 50 per cent 
making tlie highest scores on the spelling test and the 50 per cent 
making the lowest scores on the some test. (2) The mean 
(average) grade in tlie training program for each of the two groups 
was computed; the pair of bars at the left of Fig. 4-2 sliow these 
means of 3.1 and 3.2. The same process was repeated with the 
Purdue Industrial Training Classification Test, and the resulting 
means of 3.0 and 3.3 arc shown at the right of the figure. It is 
apparent that the obtained difference ^ between the two groups 

^ The question of how great must a diiToreneo bo boforo ono can bo reasonably 
oertain it could not have oocurred through chanco is disoiissod in Appendix A. 



METHODS FOR ANALYZIHG AND PRESENTING FACTS 46 

is greater in the case of the Purdue Industrial Training Classifiech 
Hon Test than in the case of the spelling test. Actually this is 
only another way of expressing the some relationship that was 
illustrated in the scattergrains in Fig, 4-1. Another analysis by 
means of simple averages could be made by dividing the trainees 
into two groups on the basis of grades and computing the average 
score for each lest for each group. There are times when the 
criterion is of sucli a character that this latter approach is neces- 
sary, for example, if employees are rated as either A or B and only 


M 



20 % 20 % 20 % 20 % 20 % 20 % 20 % 20 % 20 % 20 % 
SPELUNG TEST ITC TEST 

Fio. 4-3^— Successive average course grades of trainees divided into five de- 
grees of cxoellerrco on the basis of two different tost scores, 

two classifications exist or when a comparison is made between 
the scores of discharged and satisfactory employees. 

Successive Averages. — ^The method of successive averages is 
identical with the method of simple averages except that the em- 
ployees are classified into more than two groups for purposes of 
comparison. In Fig. 4-3, the same facts previously used are em- 
ployed. However, instead of dividing the trainees into the 50 
per cent doing best on the test and the 50 per cent doing poorest 
on the test, they have been classified into five groups, the best 
20 per cont on the test, the poorest 20 per cent on the test, and so 
on. The average grade of each successive group was computed, 



46 PRINQJPLEa OF PBRSONHEL TESTING 

and the two graphs in Fig. 4-3 were prepared. Here again it is 
evident that the relationship between Purdue Industriod Training 
Classificatim Test scores and grades is much more marked than 
is the case with the spelling test. The method is also applicable 
where employees have been rated into three or more groups. The 
average test score of each group can be computed. 



Pra OiiTO fihotring moan aourao grades of vnrioiifr }>roi>ortions of Iho 
group sclcatcd on tUo bn^iR of highest porfornumcn on Lho Hpolling tesb. 

Cumulative Averages. — ^Tho method of cumulative averages, 
though occasionally more useful, is sometimes more difficult to 
explain and interpret to others. In tlie data that have been used 
here for illustration, it is known that tlie whole group had an 
average grade of slightly more than 3.1, In deciding whether or 
not to use one of the tests in question for future selection, one 
might well raise the question, “ What would the average have 
been if only the 50 per cent doing best on the test had been 
admitted to the program?’' Figure 4-4 answers this question 
with respect to the spelling test. Generally speaking, the smaller 
the percentage (the best 10 per cent, the best 20 per cent on the 
test) admitted, the higher the average grade of the group. It 
will bo noted, however, that the percentage admitted would need 
to be quite small to result in any appreciable improvement in the 
average of tlie accepted group. Figure 4-6 presents the same facts 
for the Purdue Industrial Training ClassificaUon Test, Here the 
upward trend of the curve is more marked and more consistent. 
A comparison of the two figures certainly indicates the superior- 



METHODS FOR AHALYZING AHD PRESENTINQ PACTS 47 

ity of the Purdue Industrial Training Classification Test for the 
selection of these particular trainees. The method is called the 
method of cumulative averages because in the process of com- 
puting the various averages, a cumulative procedure is helpful. 
The same approach may also be used by allowing the horizontal 
axis to represent minimum teat scores. When this is done, the 



!Fia. 4n5j~Ctirvo showing moan oourse grades of various proportions of the 
group solcctcd on the basis of highest performance on the Purdue Industrial 
Trainifig Claaai/icalion Teat, 

curve answers the question, "What is the average grade of those 
who made a certain score or higher?" 

METHOD OF PERCENTAGES 

Simple Percentage8.~Frequently data being analyzed are of 
such a nature that it is either impossible or undesiraUe to work 
with averages. In these instances it is possible to work with 
the percentage who attain a certain standard or meet a certain 
condition. Using the some facts as in the earlier examples, tiie 
question can be asked, "What percentage exceeds the average 
grade of the total group?" The answers are shown in Fig. 4-6. 
The trainees were classified into the best half and the poorest 
half on each test, and the percentage exceeding 3.1 (the over-all 
average) computed. The same trends that have been noted be- 
fore still appear, and an evaluation of the two tesis can be made 
in terms of the differences in percentages. 

Figure 4-7 expresses these facts in a dightly different fashion. 
Here the total length of the bar represents 100 per cent of the 


4g PRINCIPLES OP PERSONNEL TESTING 

group in question. The open seotion of the bar, like the bars in 
Fig. 4-6, represents tlio iwrccntage attaining the standard or meet- 
ing the condition; the shaded portion of tlie bar represents tlie 
percentage failing to meet the standard or condition. 


66% 



56% 



SPELLING TEST ITC TEST 

I'm. 4-6.— Fcrcentneo of trainoea' gtndoa oxoeoding 3.1 in high and low ecoring 
half of each of two teats. 


SPELLING TEST 

High scoring hoV 
Loiif scoring 



158% 


ITCTCSr 

High scoring holf 
Urn scor-* ^ 
Inghalf 



70 eb 90 w ^ lb 6 o 2:0 ab 40 ^ 

Per eeni 3.1 and below Per «iaI ebew 3.1 

Im. 4-7r— Feroontago of tmincea receiving grades abovo and below critical 
score when divided into high- and low>sooring half on oadi of two tests. 


Successive Percentages.— The method of successive percent- 
ages is actually only a modification of the simple percentage 
approach. Instead of classifying tho individuals into two groups, 
they ore placed in three or more groups and the percentages ore 
computed for each group. Figure 4-8 shows the group classified 




METHODS FOR ANALYZING AND PRESENTING FACTS 49 

into fivG cSitcgorios on tho b&sis of spoUing-tost porfonnoiiicc nnd 
into five categories on the basis of Purdue Industrid Training 
Classification Test performance. In eacli instance, the percentage 
of each subgroup that received a grade of 3.2 or better is shown. 



LOW Niat Mkldlt NmI Hlflh 
20% eo% 2Q% 20% 20% 

8PELLIN0 TEST 



Low Next MIddte Next HtQti 
£ 0 % 20 % 20 % 20 % 20 %. 


ITC TEST 


Fifl. 4-8. — ^Feroentage exceeding coiirae grade of 3,1 when divided into five do- 
greea of excollenco on tho baaib of each of two different testa. 



Fig. 4-0.— Fercontage exceeding grade of 3.1 in course when varying propois 
tiona are selected on the basis of highest porfonxianco on spelling test. 


The greater relationship in the case of the classification test is 
again evident. 

Cumulative Percentages.— The method of successive percent- 
ages indicates the percentage within each test group that attained 
a given standard. Suppose tibat the best 30 per cent, the best 60 


50 PRINCIPLES OF PERSONNEL TESTING 

per cent, or some other proportion had been accepted. What per- 
centage at the various levels would have attained tlio standard? 
The method of cumulative percentages answers this question, and 
Fig. 4-0 sliows its application to the spoiling test data. Here it 
will bo noted, the proportion receiving grades of 3.2 or better is 
not materially altered by taking the upper 30 per cent. Figure 



High scoring porcontoge 

4-10, — ^Pcrconfaigf* excroding grnrlo of 3,1 in conriio whon vniying proper* 
tiona aro anlcctod on the basis of highest porformnneo on Iho Purduo Indu^lnal 
Tramiixg Clamficaiion Tc^L 

4-10, however, treats the Purdue Indualrial Training CUmificoh 
lion Teat facts in tlie some fashion, and it is again clearly ap- 
parent that this latter test is superior for this purpose. Generally 
{peaking, the smaller the proportion selected in terms of test su- 
periority, the greater the improvement in the percentage of 
trainees who receive grades of 3.2 or better. 

THE PROFILE METHOD 

Meaning of Percentile Scores. — ^Percentile scores are useful 
for comparing performance on two or more tests that have dif- 
ferent numbers of items and different levels of difficulty. For 
example, in tile cose of tlie two tests used for illustrative purposes 
in this chapter, one has a maximum score of 23 whereas on the 
other some trainees score as high as 98. How are comparisons to 
be made from test to test? What score on the spelling test repre- 



METHODS FOB ANALYZING AND PBESSNTINO FACTS 61 

sents about the same level of accomplishment as a raw score of 
20 on the Purdue Industrial Training Classification Test? Per- 
centile scores provide one method for answering these kinds of 
question. The percentile score equivalent of a raw score value 
indicates the percentage of individuals in a defined group who 
scored at that score level or below. For example, if a given raw 



Fio. 4-ll^Fereentilo graph ehowing standing of one trainee on. eadi of nine 
testa. 


score falls at the eightietii percentile, it means that 80 per cent of 
the group made that score or a lower score; the converse is that 
20 per cent did better. The fiftieth percentile, then, always 
defines tihe mid-point, or the score on the test ihat was exceeded 
by half the people. The twenly-fifth percentile and the seventy- 
fifth percentile indicate the raw-score limits between which the 
middle 50 per cent of the scores lie. 

Individual Profiles. — With this concept, then, it is possible to 
administer a battery of several tests to a group of employees or 
applicants and then to convert each individual's raw score on 
each test to the corresponding percentile value. These percentile 
scores are then plotted on a chart known as a profile for visual 
comparison. Figure 4-11 is on example for one electrician trainee 



62 PRINCIPLES OF PERSONNEL TESTINQ 

and shows his telntive standing on each of the nine tests tcJcen. 
His standing on both tlie electrical information test and the 
arithmetic test is at the seventieth percentile, indicating similar 
performance on the two tests. The profile indicates that he is 
above average on all but two of the tests and that he did rela- 
tivdy best on the Adaptalbtlity Test and poorest on Uio radio ap- 
titude test. 



Fiq. 4-’12^Poi'OQtitilQ graph idiowing Iho mean profilo of tlio fourUi of ilio olosa 
with highest course grades and Um fourth with tho lowest conrso grades. 


Group Profiles.— Just as a profile can be prepared for an in- 
dividual, BO can one be prepared to demonstrate tho average 
performance of a subgroup. ITrequently it is desirable to plot the 
profiles of two oriteriou groups of employees, one that is meeting 
the standard of job success and one that is not. For example, in 
the electrician trainee study two criterion groups were identified. 
The 25 per cent who made the best grades in the school were colled 
''good”; the 26 per cent who made the poorest grades were called 
"poor.” A comparison of the profiles of these groups makes pos- 
sible the evaluation of the several tests involved. For example, 
if on a certain test the percentile position of the average score of 
the good group is no hi^er than the corresponding percentile 



METHODS FOR AHALYZINQ AND PRESENTING FACTS 63 

score of the poor gi'oup, the tost does not discriminate. Figure 
4-12 siiows the profiles of these two groups. It is apparent that 
the tests are not equal in the degree to which they discriminate 
between the criterion groups. The radio aptitude test, for exam- 
ple, shows practically no difference, whereas the Purdue Industrial 
Traininff Classification Test shows considerable spread. Once 
such a group profile has been used to indicate what tests in the 
battery are likely to be useful, it can be used as a standard against 
which to match individual profiles like the one in Fig. 4-11. It 
provides a basis upon which to judge the extent to which an in- 
dividual pattern corresponds to the pattern of the superior group. 

SniCMARY 

Data from an electricians* training program have been used to 
illustrate the four basic methods for analyzing and interpreting 
the relationship between test scores and measures of job success. 
The methods discussed, the scattergrain, the method of averages, 
the method of percentages, and the profile method, each has minor 
variations, one of which will satisfy the demands of each specific 
problem. 



GlIAFTEli V 


MENTAL ABILITY TESTS 


The Meeming of Mental Ability. — Mental ability or intel- 
ligence is more easily measured than it is defined. This may 
sound strango until one considers the fact that physicists measured 
electricity long before they were able to understand its nature. 
Altliough tliero is still some disagreement among psychologists 
regarding the nature of mental ability, tliere is general agreement 
that mental ability is reflected in "quickness on the trigger,” 
ability to learn, versatility, and general competence. Some people 
learn new taslcs more quickly than others. Some can figure their 
way out of problem situations more readily than otliors. Some 
are capable only of relatively unchanging, repetitive activity; 
otlici'S oi’c extremely versatile in the sense that they ai'e able to 
adjust tlicmselves to situations which require the rapid shifting 
of attention among a variety of different stimuli and the making 
of certain judgments regarding tlieir relationships. Some people 
are simply more competent than othci's in a general way, not alone 
because they possess specific skills or know certain facts but be- 
cause they are capable of meeting situations that many times 
require the manipulation of abstract ideas or concepts. In a sense, 
mental ability represents a kind of mental "horsepower” rating, 
and some activities or jobs mquire moro horsepower to get them 
done than do others. 

Primary Mental Abilities. — ^The research conducted by Thur- 
stone and others indicates that this competency which charac- 
terizes the behavior of so-called "more intelligent people” is not 
a single trait or ability but, more accurately, is a combination of 
several quite specific abilities. Researchers disagree regarding the 
extent to which these primary abilities are interrelated ; but when 

i TBusaroNiQ^ L. L. The vectors oj mind^ Chicago; Umvensity of Chicago 
ProBS, 1036. 

M 



MENTAL ABILITY TESTS 


55 

these traits are taken together without regard to the exent of 
interrelationship, most experimenters include them in their con- 
cept of intelligence/ Below is a list of some of the abilities that 
have been identified by Thurstone and others. 

1. Verbal exility, reficcted in facility with words and language. 

2. Numerical ability, required in the simple arithmetic opera- 
tions but not in the more complex reasoning types of situa- 
tions. 

3. Memory ability, cliaracterized by the recall of recently 
learned, rote memory material. 

4. Visualizing ability, required in the performance of tasks in- 
volving space relationsliips. 

5. Mental fluency, required in the making of rapid responses or 
adjustments to abstract tasks. 

6. Perceptual speed, required in the rapid identifi-cation of dif- 
ferences in visual patterns. 

7. Inductive ability, required in the discovery and application 
of some rule or principle that is operating in a situation. 

8. Deductive ability, representing what is most often popularly 
referred to as reasoning ability. 

All present-day tests of mental ability are made up of questions 
that measure several of these various abilities. The various tests 
differ primarily in the emphasis that is placed on each and the 
nature of their organization. 

Kinds of Tests. — Some mental ability tests have subparts 
that ore intended to measure some of these specific abilities or 
combinations of them; others are made up of questions that 
sample these various abilities in a more or less random fashion. 
In business and industry the latter has, generally speaking, been 
most useful, although there are outstanding instances in which the 
more specific teats have done exceedingly well. One common 
classification that has been employed by test makers has resulted 
in the construction of language and nonlanguage tests, the latter 

^ For a comprehenflive and critical review of the literature see CattkIiL, 
XUyuokd B» The meaBuromeut of adult intelligoiico. Psychol. Bull., 1943/ 40, 
153^193. 



66 PRINCIPLES OF PERSONNEL TESTING 

containing only tliose questions which are not dependent upon 
language mastery. Tests may bo administered to only one person 
at a time, or they may be of tlie group variety with which the 
number that can bo tested at one time is limited only by the testing 
facilities. The latter is used almost exclusively for personnel 
placement. 

Ability Differences and Their Origin. — ^It is well known that 
individuals differ markedly in any single one or combination of 
the specific abilities listed above. Because the term mteUigenee 
was adopted before the more recent investigations were made, 
there has been a widespread belief or, at least, an implication 
tiiat intelligence or mental ability is innate and that it is relatively 
independent of the various environmental influences with which 
an individual comes in contact. In fact, certain psychologists 
have defined intelligence as just that. It is difficult, however, to 
consider a concept of this natm'e which is not measurable, and 
purely innate abilities do not lend themselves to measurement. 
An individual may be given verbal or nonverbal tasks to perform, 
and his ability to perforin them can be measured. He has certain 
amounts of certain abilities now. The fact can be verified, but to 
speculate regarding the relative contribution of heredity and 
environment to his present status leads to highly theoretical con- 
siderations which are extremely difficult to investigate. Although 
such matters ore important from tho standpoint of social welfare 
and educational philosophy, they are quite unimportant in the 
matter of personnel placement. If a given test, by one or more 
of tlic simple techniques presented in Chap. IV, has been shown 
to be related to some measure of job success, the question of how 
the particular employee or applicant got that way is unimportant. 
Many psychologists have adopted the definition that mental 
ability is that ability or those abilities which mental ability tests 
measure. This is not a facetious statement and is actually the 
most meaningful of all definitions. As a matter of fact, the many 
tests of mental ability yield markedly consistent results. In fact, 
they are more in agreement than ore the people who make them 
when they start discussing such questions as heredity vs. environ- 
ment and one vs. many traits. People differ in their abilities to 



MENTAL ABILITY TESTS 


57 

perform certain tasks in tests and elsewhere; these abilities which 
they possess now are actually the product of eiivironment and 
heredity. The important fact is that, provided a particular test 
correlates with success on the job, how an individual got that way 
is relatively unimportant. 

Inequalities in Opportunity. — Some of the more recent re- 
searches have indicated that mental ability test scores are 
markedly affected by the educational and the sodoeconomio 
opportunities that an individual has had. For this reason, the 
statement is sometimes heard that a given individual should not 
be penalized in his test score because he has never had the oppor- 
tunity to develop the skills or abilities tested. Again, it should be 
considered that if test scores are correlated with job success, how 
he got that way is unimportant from the standpoint of industrial 
placement, important as it may be from the social point of view. 

The I.Q. and Other Scores. — Everyone has heard of the I.Q. 
as an index of a person’s performance on a mental ability or intel- 
ligence test. The I.Q. is the intelligence quotient. It is a quotient 
representing the ratio of an individual’s mental age to his chrono- 
logical age. It was developed for use with maturing children 
and actually yields the ratio between a child’s mental develop- 
ment and his chronological development. Thus a child of any 
age, who has matured mentally at the same rate as the average 
child his age is given a quotient of 100. If his mental development 
has been more rapid than the average, he has an I.Q. greater than 
100; if his mental development has been arrested or has been 
otherwise slower than the average for his age, he has an I.Q. 
smaller than 100. More than half of the children of any age group 
have I.Q.’8 between 90 and 110. Since the I.Q. is a ratio, and since 
the kinds of ability that intelligence tests measure generally cease 
to improve after late adolescence, the I.Q. is much less meaningful 
at the adult level than it is during childhood and adolescence. 
The authors of tests using the I.Q. sdeot some age usually between 
sixteen and twenty as the maximum chronological age to be em- 
ployed in computing the ratio, regardless of the individual’s true 
age. 

Tests of mental ability that ore constructed primarily for adult 



58 pnwaiPLES OF PEnSONNEl TESTINO 

use in industry usually employ the raw score, or number of ques- 
tions answered correctly, in prcfercnco to the T.Q. Those, in turn, 
are usually converted to percentilo values^ so that it is possible 
to say that insofoi' os the abilities moasurcrl by a particular test 
are concerned, a given individual is in the upper or the lower so 
many per cent of applicants or present employees. 

MBtTTAL ABILITY AKD JOB PLACEMENT 

Occupational Groups and Ability Levels. — ^Thc mental ability 
levels of individuals in various occupational groups differ. 
Harrell and Harrell® present data accumulated during the Second 
World War, and Fig. 5-1 prepared from data in their report lists 
seventy-five civilian occupations in descending order of average 
or mean scores on the Army General CloMification Test of 18,782 
soldiers engaged in these occupations prior to service. The shaded 
bars mark off the test score limits within whidi approximately 
two-thirds of tlie men in each classification scored. Thus, men 
who were accountants prior to induction averaged 128 points in 
contrast to men who were teamsters, who averaged 88 points. 
Two-thirds of tlie former scored between 116 and 140 whereas 
two-thirds of the latter scored between 68 and 107. Although 
certain difficulties arc always encountered when job titles ore dealt 
with, nevertheless the figure serves to demonstrate that occupa- 
tions tend to attract and to hold individuals in a relatively narrow 
mental ability range. This finding is corroborated in many other 
studies one of which was conducted by Pond.* In an investigation 
involving 0,026 employees in forty-four different occupations she 
found that the average or mean mental test score of employees on 
each occupation correlated 0.77 with estimates of the intelligence 
required for these occupations. Bills * in a study of 780 clerical 

1 Seo Chap. IV, p. 60. 

^ HariusUi, Thomas W., and IlAmnsTiii Maroaret S. Army gonoral claesi-* 
ficatioQ test aoores for oivilinu occupations. J, oduo, and Psychol, Mcas., 1045, 5, 
220-230. 

■ Pond, Milijcmnt. OccupationR, mtolligonco, rro, and school ing: thoir mla- 
tionship and distribution in factory population. Parson, J., 1083, 12, 378*882, 

* Pond, Mii.ucbnt, and Marion A. Iiilolligcnco anil clerical jobs;, two 
Btudiea of relation of test score to job hold. Person, J,, 1033, 12, 41-50, 



MENTAL ABIIJTY TESTS 


SQ 

workers in an insurance company classified the jobs into eight 
grade levels ranging from A to H with the A classification repre- 
senting the lowest level of job. Table V shows the proportion in 
each mental ability score bracket in the various classification 
levels. For example, whereas 82 per cent of those scoring zero to 
40 on the teat were in A or B jobs, only 26 per cent of those scoring 
140 or better were on these jobs. Table V shows consistent trends 


TABLE V 


Test scores 

Job-clossificntion levels 

1 

A and B 

CandD 

E and F 

O&ndll 

0-40 

82 

9 

0 

0 

41-60 

77 

16 

7 

0 

61-80 

71 

24 

0 

0 

81-100 

01 

27 

12 

0 

101-120 

43 

34 

20 

8 

121-140 

32 

32 

26 

10 

141 aiid nhovc 

26 

30 

35 

0 


throughout. Bills also conducted a similar study with 123 steno- 
graphic employees and obtained essentially identical results. 

Tenure and Mental Test Scores. — Studies of this character 
have been criticized in that they provide no evidence that per- 
sons at these respective mental ability levels are necessarily best 
qualified for the job. Does more or less mental ability than is 
possessed by tlie mine-run of those on the job ensure failure in 
that job? The answer is definitely negative. However, there is 
evidence that the probability of success is greater if one is within 
the preferred range. One of the earliest studies along this line 
was conducted by Pond ^ in which she used the fact of whether 
or not an individual either terminated or was terminated within 
a given period of time as the criterion of success on the job. Figure 
5-2 shows her findings in four female and five male classifications 
or job families. With each classification “preferred ranges” of 
teat scores were set up by inspecting the score distributions of 
those who stayed on the job a stipulated period of time or longer 

^ Pond, Milucbht. Scleotive placement of mental workers. I. Preliminary 
studies. J. Person. Ret,, 1027, 5, 346-36S. 



60 


PRINCIPLES OF PERSONNEL TESTING 


Accountonf— 

Lowyer — — — 

Enginw— — 

Public relations man 

Che mist “ — — 

Reporter — 

Chief clerk 

Teacher 

Orof tsman-^ — 

Stenographer — 

Pharmoctst 

Tabulating machine operator 

Bookkeeper — 

Manager, soles 

Purchasing agent— ^ — 

Manager, production * 

Photographer—^ — — — - 

Clerk, generol 

Clerk-typist 

Managei; miscellaneous 

Installer-repairman, TeL and Tel. 

Cashier — — — — 

Instrument repairman— 

Rodio repairman — 

Printer, Job pressman, Lithograph pressman 
Salesman 

Manager, retail store — 

Loborotory assistant 

Toolmaker — 

Slock clerk 

Receiving and shipping clerk— 

Musician — — — — — 

Machinist — 

Foreman— - 



140 


ac J. Scores 

Fio. Avorngo Army G^cnero^ CloBnflcaiion Tost flooroe nnd middle Iwo- 
oivilian occupstioxis. (From Uarroll and Harrell) 


MENTAL ABILITY TESTS 


(ZO 


61 

r^a 


Watchmaker— — 
Alrplone mechanic 
Sales clerk 


Electrlclon 

Lathe operator 

Receiving and shipping checker^ 

Sheet mefot worker •- 

Linemon, power and TeL ond Tel 

Assembler — 

Mechonic 

Machine operator 

Auto eervlcemon — 

Riveter— 

Cabinetmaker — 

Upholaterer 

Butcher — 

Plumber 

Bortender— — — — 

Corpenter, construction * 

Pipe fitter 

Welder 

Auto mechanic 

Molder— 

Chauffeur * 

Tractor driver 

Painter, generoJ 

Crane hoist operator 

Cook and baker 

Weaver — — 

Truck driver— — — — 

Laborer — — 

Barber— — — 

Lumberjack — 

Former 

Farmhand — — 

Miner — 

Teamster — 


60 


60 


too 
























kvvgbJ U 











— i-4— 










1 




















60 


60 


100 


120 


140 


GXa Scores 

thirds range of 18,782 Army Air Forces white enlisted men by seventy-five 


62 


PRINCIPLES OF PERSONNEL TESTING 


Test scores 

70 90 no 130 150 170 
60 80 100 120 140 160 IBO 


FEMALE OCCUPATIONS 

RhytKmlc feed 
mochfne oparolprs 

Mtscellonaous 
bench operators 

Automatic feed 
machine operators 

Simple feed 
machine operators 

MALE OCCUPATIONS 

Tooleetters^ hand screw 
mochine operators 

Light machine 
operotors 


Hot forgers, hordners, die 
polishers, wire machine operb^ 

Painters, masons, and 
steomfitters 




II 

1 

1 

1 

1 

1 






inspectors 



Percentage of 
terminations 

□-..1 - 

28 % 44 % 10 % 

clM 


26 % 35 % 17 % 


Q. 


33% 55% 16% 




41% 46% 32% 


□. 


35% 48% 19% 




40% 57% 30% 

□ i 

32% ^ 

a 


23% 



49 % 61 % 32 % 


fiQ, 6-2^Froforrad score ranges (solid bars) and nonprcforrod eooro ranges 
(shaded bars) for oach of nino job classiflcatlonst At iho right the vortical bar 
graphs represent tlic porcentago who ternunaied in the total group (open bars) 
among thoeo making nonpreforred scores (shaded bars) and among thoso who 
made preferred scores (solid bars). (From Pond.) 


as compared with that of those who did not remain. The solid 
bars in tlie figure indicate tho preferred ranges, and the shaded 
bars the nonpreferred ranges. With both malo and female jobs, 
there is an upper score limit for the lower level lobs, a lower score 


MENTAL ABILITY TESTS 


03 


limit for the higher levd jobs, and both upper and lower score 
limits for the middle jobs. At the right of the figure, the open bars 
show the percentage of the total classification group terminated, 
the shaded bars Indicate the percentage of those in the unaccept- 
able range terminated, and the solid bars indicate the per cent in 
the acceptable score range terminated. 

Pond’s findings were not limited to these nine classifications by 
any means. In a study ^ of 3,184 employees in sixty-five job classi- 
fications, all treated in this same fashion, she found that whereas 
48 per cent of the whole group terminated, only 19 per cent of 
those in the acceptable ranges terminated. These findings to- 
gether with similar ones from other studies serve to establish the 
fact that there is a desirable range of mental ability for each job 
classification. To place on high-level jobs applicants who have 
less than the desired amount of mental ability is to place on those 
jobs persons who cannot deliver what the job demands and so are 
released, or persons who are made so unhappy in the process of 
delivering above their normal level that they quit. To place on 
low-level jobs of a repetitive nature applicants with more mental 
ability than is required is to place people who are apt to find the 
job boring, monotonous, or uninteresting. Such people are quite 
apt to leave for better jobs or, if they do remain, to develop into 
troublemakers who ore eventually termmated. 

Personnel departments that have thought of mental ability 
tests purely in terms of identifying the high scorers have over- 
looked one of the important values of such tests. A company 
whose selection program has been based upon such a policy has 
achieved the moat from the tests only if it has a preponderance of 
high-level jobs. Since job-simplification and production-method 
trends are continually increasing tlie proportion of repetitive jobs, 
the concept of appropriate levds is becoming a more and more 
useful one. 

Measures of Job Satisfaction. — ^Mental ability is one of the 
determiners of job satisfaction. The employee who is correctly 
placed, other things being equal, is more nearly satisfied. One of 

I- Pond, Milltcbnt. Seleotive placement of metal workers. IL Development 
of scales for placement. J. Person. iZee., 1927, 5, 405-417. 



04 PRINCIPLES OF PERSONNEL TESTING 

the very early studies of job satisfaction was carried on by Scott 
and Hayes ^ before present-day measures of mental ability were 
as widely used. However, estunates of intelligence were made 
from educational records on the basis of years retarded or advanced 
in school. Figure 5-3 shows the percentage of each educational 
group that was dissatisfied in two different job classifications. In 




Retarded Advanced 

Fio. 5-3.*— Porcentngo of employees on two cliiTcrent jobs expreasinf; dcaire to 
ohango job| olaasificd by yotira of roUinlation or aeuolonitioii in aohool. (From 
8coU and Hayca,) 


the case of inspection tlio fewest dissatisfied employees were found 
among the low-ability (high-retardation) groups, whereas those 
who were retarded least had a greater tendency to bo dissatisfied. 
This particular inspection job was highly repetitive in nature, re- 
quiring a minimum of judgment and related ability. In the case 
of assemblers, who needed a fair degree of judgment, tlie picture 
is quite the reverse. Those who were most frequently dissatisfied 
were those with tlio lowest ability, and those who were least 
frequently dissatisfied were the most able. This is additional evi- 
dence that the most able are not necessarily the best on all types 
of jobs. 

Other Criteria of Job Success. — As pointed out in Chap. Ill, 
there are many measures of job success. Whether or not an 
employee is promoted to the next higher job may be one such 
criterion. In Bills’s clerical study* previously mentioned, an 
examination was made of the imroentage of employees in various 

^ Scott, W. D., and Hayus, M. H. S. Setsnee amt ommon »etu$ in toorilnnfr 
with men. Now York; Ronald, 1021, p. 78. 

* Fohd AMD BiUiS, op, eU, 



MENTAL ABILITY TESTS 


w 

test score brackets who were promoted to higher leyds of jobs. 
Figure 5-4 is based upon her findings. The figure shows that of 
those scoring 81 or higher on the mental ability test, 75 per cent 
were promoted to C level jobs or better, whereas of those scoring 
below that mark, only 36 per cent were promoted. Of those scor- 
ing above 100, 56 per cent were promoted to E level jobs or bet- 


Scores 
81 and above 

60 and below 


101 ond obove 
100 ond befow 


64% 




Per cent no! 
promoted 



44% 


W//////jyA 


Job 
better 

Per cent promoted 


Fio. — Percentage of clerical employees scoring above and below two criti- 

cal scores who wero promoted to higher level jobs. (From Pond and Dills*) 


ter; whereas of those scoring 100 or less, only 12 per cent were 
promoted. 

Wadsworth * has also found job differences in intelligence test 
scores. Like Pond, he established acceptable score ranges, most 
of which involved minimum critical scores only, because of the 
charactei' of the jobs in his industry. He made a study of the su- 
pervisory ratings of employees who were selected prior to the in- 
auguration of the testing program and compared them with the 
ratings of employees selected when tests were used to augment 
the previous selection procedure. His results are presented in 
Pig. 5-6. Without tests 29 per cent of those hired were rated by 
supervision as being problem employees, whereas only 5.5 per 
cent of those selected with the help of tests were so rated. The 
proportion of employees rated outstanding was at the same time 
increased from 22 to 33 per cent. 

Wadsworth * further reports his findings in connection with 793 

^ WADBWOiern, Gut Jb. Tests prove worth to a utility. Peram. J,, 1936, 
14. 183-187. 

^ Wadbwobth, Gut W., Jn. How to pick the men you want. Peram, J„ 1936^ 
14,330-336. 



0g PRINCIPLES OF PERSONNEL TESTING 

employees in twelve different occupations who were hired with 
the help of mental ability tests and who were later rated by their 
supervisors. Figure 5-6 from Wadsworth’s data shows the per- 
centage of "outstanding,” “satisfactory,” and “problem” em- 


Tasted 
Not tested 


6I57« 33% 



Per cant Par cent considered 
considered I satisfactory or out- 
"problems'* standing 


Fio. — ^PercciitagG considerrrt to bo problems (shaded bars), satisfactory 

(open bars), and outstanding (solid bars), among omployecs sclcctCHl by means of 
testa vs, those selected witlio\it testa. {From Wadswrih ) 


I-Q- 108% 

487% 405% 

ilOond obova (N»I54) ^ 


153% 

Sl.2% 295% 

80 to 109 (N«3I2) K 


3BA% 

52.1% 195% 

79 and below (N<207) fMUfm 

■I 

Por cant 

Per cent sotisfoctory 

"PfoWema** 

or out&tonding 


Pie* 6-6^Porco&tage oonsidercd problems (shaded Imra), entisfactory (open 
bars), and outstanding (solid bius), among employocs in three I.Q ranges. {From 
WadsvM)rth.) 


ployees for each of three I.Q. ranges. The proportion of problem 
employees ranges from 10.8 in the high-intelligence bracket to 
35.4 in the low bracket. Whereas the number of employees rated 
as satisfactory is relatively constant throughout the I.Q. range, 
the percentage of outstanding employees ranges from 12.6 in the 
low mental ability group to 40.5 in tlie high. 


SPECIFIC TESTS OF MENTAL ABILITY 

There are many tests of menial ability, most of which are good. 
Generally speaking, these various tests tend to measure the same 
abilities. Usually the decision of which test to use is made in 
terms of such factors os (1) the time required for administration, 






MENTAL ABILITY TESTS 


67 

(2) the orientation provided for the applicant in the directions, 
and (3) ease of scoring, in addition to the more technical charac" 
teristics of the test. It is impossible to mention all available 
tests ^ here. Those discussed were chosen because of their wide- 


GROUP B ( cnilclzed ) 


255%. 745% 


Wm. 


GROUP A 
(Not crMlctzed) 


906% 119.4% 




Missed 16 items [Missed 15 items 

or more 'or less 


Fig* 6-7^ — ^Pcrccntago of those missing more or less than elxteen items among 
those managers who were critioisod and among those who wore not. {From Sie-> 
vens and Wonderltc ) 


spread use in personnel offices and because validation studies have 
been published or were otherwise available. 

The Otis Self-administering Test of Mental Ability. — ^The 
''Otis,” as it is called, was used first in schools. Personnel people 
who needed a test for business and industrial use tended to choose 
it in preference to other school tests. It has had wide usage, 
usually with satisfactory results. Stevens and Wonderlic* ad- 
ministered it to 160 branch-office managers for a personal-finance 
company. These men were classified into two groups, group A 
including those managers who, according to record, had been 
severely criticised for their methods of systematizing their office 
procedures, handling details, and generally following work pro- 
cedures, and group B including those managers who hod never 
been criticized on these bases. Their performance on the Otis was 
examined in terms of the number of questions missed or omitted. 
Figure 5-7 shows that of the group with a record of criticism, 80,6 
per cent missed sixteen questions or more whereas, of the group 
not criticized, only 25.5 per cent missed sixteen questions or more. 

There have been numerous other validation studies involving 
the Otis, one of which pertains to supervisors in a textile mill, a 
report of which appears on page 172 of Chap. XII. 

^ See Appendix C for a comprehensive list of tests and publishers. 

> BtbvhnBi Sauuibl N., Am Wondbrlio, Eldon F. The relationship of the 
number of questions missed on the Otis mental tests and ability to handle office 
detail. J. appl, Psychol, 1034, 18, 364-368. 



68 PRINCIPLES OF PERSONNEL TSSTINQ 

The Wonderlic Personnel Test. — The Personnel Test is a 
twelve-minute revision of the Otis test and was designed for em- 
ployment oflSce use. It has been widely used with excellent re- 
sults. Typical are those reported by the author ^ himself from a 
study of the scores made by the outside representatives of a per- 
sonal-finance company. Figure 5-8 sliows these representatives 


14%. 


Employed one yeor or more 


65% 


Orsmtssed or left 


Per cent scoring 
below 25 


86% 


35% 

Per cent scortng 
25 or higher 


Fia, 5-8 ^Percentage of those scoring above or below a critical score on the 
Personnel Test among representatives who stayed on the job n year or more vs. 
those who did not. (From Wonderltc.) 


classified as those who stayed with the company a year or moro 
and those who did not. The figure sliows the percentage of men 
in each group who scored 25 and above and the percentage scor- 
ing below 25. Whereas only 35 per cent of those who did not stay 
with the company a full year scored in the higher bracket, 86 per 
cent of those who did stay with the company a full year or more 
scored 26 or higher. Wonderlic fuii^her reports noticeable dif- 
ferences in the average scores of 100 men selected by supervisors 
because they had on outstanding record for four years or more and 
of another group of 100 who had foiled to progress. A similar dif- 
ference was reported between the average scores of a group of out- 
standingly successful branch managers and another group of 
managers judged as outstanding failures. The test has been criti- 
cized by some because of its short time limit, but Wright and 
Laing * have demonstrated that the differences in results obtained 
in twelve minutes as compared with twenty-four minutes are neg- 
ligible. 

The Adaptability Test. — The AdaptabUity Test is newer than 

1 Wondeulk!, E. F., and Hovland, Carl I Tho personnel tost: a rcstaadardiied 
abridEemcnt of the Otis S-A Tost for business and industrial use. J. appl. 
Pmhol., 1930, 23, 086-703. 

* WraoBT, James H., and Laino, Donald M. The time factor in the admin- 
istration of tho Wonderlic personnel test. /. oppl. PtyehoL, 1043, 27, SIO^SIO. 



MENTAL ABILITY TESTS 


the other two and was designed specifically for personnel place- 
ment. It has been widely used since its publication, and much 
validation information is available. Figure 5-9 shows results ob- 
tained with eighty-eight office clerks in a paper mill.^ These 
employees were rated by their supervisors as being A (quite satis- 
factory) or B (not so satisfactory) employees. In the total group 



tho A rated (good) group when no teat waa employed and the percentage who 
would have been in tho A group had aucccasWely higher mmimum sroroa on the 
Adaplabiliiy Test been uacd to siipplemont existing hiring procedures, {From 
Tiffin and Lamhe.) 

57 per cent were rated A. However, among those who scored 15 
or higher 64 per cent were A’s, and of those who scored 30 or higher 
88 per cent were A employees. 

The test was also administered to a group of electrical 
trainees * prior to training, and the results are presented in Fig. 
5-10. Twenty-two per cent of the whole group received a grade 
of 3.5 or better fifteen weeks later. However, among those scor- 
ing 20 or better, 67 per cent made grades in this category. As 
the figure shows, there is a consistent increase in proportion of 

* TmnN, Joseph, and Lawshb, C. H, Jb. The axlaptability test: a fifteen 
minute montal alertness test for use in personnel allocation. J. appL PtychoL, 
im, fi7, 162-183. 

> Ibid. 





high-graded trainees with corresponding increases in Tninimnwi 
test scores. 

In another validation study, 240 employees selected to attend 
a two-week teletype training program were tested at the time 


MENTAL ABILITY TESTS 


71 

when training was started. On the basis of objective tests and 
instructor’s judgments they were given percentage grades in the 
course. Forty-seven per cent of the entire group received grades 
of 95 per cent or above. Figure 5-11 shows the percentage falling 
in this grade bracket when successively higher critical scores on 
the Adaptability Test are applied. For example, when only 


Quit within 
2 months 


Otschor^jas 


Most aoesptabla 
ronoe 



1 

\ 



0 oj 

o o 

o 

o o o 

o o| 

1 1 

O C 0 0 0 0 


1 1 
1 

i o 

O 0 


i * 

•• 

or more 

I; too 



I* , • • • 

j 

T r"i' i"r 1 

I 0 • 0 

1 

1 

■T“TT ■! I’l 1 


6 8 10 12 14 16 t8 

Adaptability scorss 

PiQ. 5-12,-^pot graph showing acceptoblo Adaptability Teat score range for 
piokler-leamens m a steel mill. 


those who score 18 or better are considered, approximately 60 
per cent received the high grades. The curve shows a consistent 
increase with successively higher minimum test scores. 

Figure 5-12 shows the test scores of thirty-eight applicants 
who were placed on the job of pickler-Iearner in a steel mill. 
Of these men who were tested at the time of hiring, seventeen, 
or 45 per cent, quit for one reason or another in less than two 
months, six were discharged for cause, and fifteen were on the 
job three months or more at the time of the study. The dots in 
Fig. 5-12 indicate the Adaptability Test scores made by the men 
in each group. As suggested earlier, upper and lower limits were 
set to establish a score range of 6 to 11. The figure shows that 
whereas twenty-three, or 60 per cent, of the entire group either 
quit or were discharged, 75 per cent, or fifteen, of the twenty of 
(jiose outside the range either were discharged or quit. Those 
who remained on the job three months or more were rated by 



PniNCIPLES OF PERSONNEL TESTING 


72 

their supervisors^ and in Fig. 5-12 those who were rated average 
or better are represented by solid dots while all others are repre- 
sented by open dots. Of those within the acceptable range, eight 
of tlie ten were rated above average but only one employee out- 
side the range received that rating. When the total picture is 
considered, it is evident that to get nine employees who would 



Adaplabiitty tail icon 


Fia Foracotaga of auporvisoi-s in four AdaptabUily Tetl score biuckets 
who were on the job six months Inter (solid bnra) and percentage who wore not 
(shaded bars). 


be on the job three months later and bo considered average or 
better the company had to hire thirty-eight men representing a 
ratio of 4.2 to 1. Whereas, had the acceptable range of 6 to 11 
been applied eight of the eighteen employed would have fallen 
in the category representing a ratio of 2.26 to 1. 

The importance of mental ability in one supervisory job is 
demonstrated by Fig. 5-13. Seventy men in a rubber plant were 
selected by usual means and were upgraded to supervisory posi- 
tions. After they were selected, each took the AdapiabUUy Test, 
but the test in no way figured in the selection process. Six 


PLACEMENT EECOMMENDATIONS FOB, VABIOT3B ADAPTABILirY TEST SCORE BRACEETTS 






74 PRINCIPLES OF PERSONNEL TESTINQ 

months later a study of the group was made, and it was dis- 
covered that many were no longer in the supervisory jobs. Some 
had quit or had been discharged, and some had been demoted 
to their previous jobs. All of those who wore originally tested 
were divided into four groups according to lest score, 0 to 4, 
6 to 9, 10 to 14, and 15 or over. As the figure shows, 100 per 
cent of those scoring in the 0 to 4 bracket were no longer on 
the job whereas only 5 per cent of those scoring 16 or more were 
no longer on the job. A systematic trend operates throughout 
tire four score groups. It is certainly evident that tire higher 
men score, the more likely they are to be supervisors in this 
plant six months later. 

The authors ' of the test have related broad job levd descrip- 
tions to test score ranges as a guide to the test user. These are 
reproduced in Table VI but are not intended to be a substitute 
for validation studies which should be carried on in each plant 
or business. Job titles are frequently misleading,' and the dif- 
ferences in job requirements ai'e so great that even good job 
descriptions do not always reveal them. 

SUMMARY 

Mental ability, which is most accurately defined as that abil- 
ity which the tests measure, is a combination of numerous 
more specific abilities. Employees who possess these abilities to 
a greater extent than others tend to be "quick on the trigger," to 
learn more easily, to be capable of versatility, and to be generally 
more competent. While performance on these tests refiects the 
interaction of heredity and environment, the proportional con- 
tribution of each is relatively unimportant in personnel place- 
ment. 

There is a desirable mental ability range for each job classi- 
fication and particularly for the lower level jobs, upper limit 
critical scores are just as important as lower limit scores. Valida- 
tion facts for three of the leading industrial tests ore presented. 

1 /bid. 



CHAPTER VI 


TEMPERAMENT AND PERSONALITY 

TESTS 


The previous chapter has dealt with mental ability and its 
relationship to job success. However^ there has been no intent 
to imply that the simple possession of the amount or degree of 
mental ability which a given job requires ensures success on 
that job. There are other aspects of personality that help to 
determine the degree of success, not the least important of which 
is good mental health plus the kind of temperamental pattern 
that is most easily adapted to the specific job in question. The 
present chapter deals with this particular phase of personality 
and its measurement. 

THE VATUHE OF XEUPERAMBNT 

Terminology. — ^Inadequate definition of terms has helped to 
confuse those who are interested in testing. For example, a 
common and generally accepted definition of personality is “the 
sum total of an individual's mental, emotional, or temperamental 
make-up.” In other words, an individual’s personality is his 
total being. This seems simple enough until one encounters a 
test called a “personality” test. In reality, any test measures 
some phase of personality, and the term temperament test is 
more meaningful, since such instruments do not measure one’s 
total personality but actually measure one aspect of it. 

What Temperament Is. — ^Psychologists have defined temperar 
ment in diffei-ent ways. The most useful definition for the per- 
sonnel or industrial relations man is the one that considers 
temperament as one's behavior tendency. Some people have a 
tendency to behave in a domineering fashion; some have argu- 
mentative tendencies; some are retiring; and so on. These 

7 & 



76 PRINCIPLES OF PERSONNEL TESTING 

tendencies can be thought of as descriptive of their tempera- 
ments. One person is described as being hot-headed; one as 
sensitive; and still another as being quite excitable. Theories 
have been advanced regarding body chemistry as a determinant 
of temperament, and studio have indicated certain relationships 
between glandular function and temperament. However, the 
personnel man can best take the same position regarding tem- 
perament as it has been suggested that he take relative to mental 
ability, namely, that how an individual got tliat way is a second- 
ary matter. From the point of view of personnel placement it 
is unimportant, important as it may be from the point of view 
of social and educational philosophy and clinical psychology. 
One’s temperament is his behavior tendency, his tendency to 
act m certain characteristic ways, not at any given instant but 
over extended periods of time. 

How Habits and Traits Are Grouped. — ^Each person in the 
process of growing, developing, or getting older acquires specific 
habits of action. Specific as these habits are when considered 
individually, they tend to group themselves together in clusters. 
So the individual who is characterised as being sliy and who fre- 
quently crosses the street to avoid meeting another is most prob- 
ably apt to retire to on inconspicuous comer at a party or, 
indeed, not to attend the party at all. Habits or traits of tliis 
kind tend to cluster together. Note that the word tend is used ; 
these traits are not always associated with each other. But 
habits and behavior traits are associated frequently enough so 
that psychiatrists and others speak of components of tempera- 
ment. Components of temperament are the larger, more general 
clusters or combinations of specific habits and behavior traits. 
For example, selfishness is a component of temperament. 
Actually, the word selfishness merely describes dozens or hun- 
dreds of specific habits or traits that an individual might possess. 
The more such habits and traits he has, the more selfish he is 
considered to be. Yet selfishness must be thought of only as a 
tendency that, although present in everyone, is more prevalent 
in some people than in others. 

As a general rule, an individual whose temperamental tend- 
encies are extreme or whose tendencies are in conflict to the 



temperament and personality tests 


77 

extent that tension, strain, or anxiety is characteristic is said 
to be maladjusted. His personality is not well integrated; and 
other things being equal, he will be less satisfactory as an em- 
ployee. 

The Components of Temperament. — ^There are various theo- 
ries of temperament, and each one recognizes or emphasizes 
various combinations of components. Humin’s adaptation ^ of 
Rosanoff’s theory ^ recognizes seven patterns of complexes or tend- 
encies. Although these tendencies are present in all personali- 
ties, it is the difference in degree or amount that is associated 
with differences in behavior. These seven components “ together 
with their prime tendencies and associated traits follow: 

Component Prime Tendency Some Aaeociated Tratlt 

Normal Self-control Nervous stability, sclf-improvementi conserve 

atlsm, self-directioa 

Hysteroid Selfishness Brivo toward advantage and profit, solf- 

preservatioD, egocentrlcity 

Mnnio Excitability Bnve, alcrtnnsa, chcoi fulness, sociability, wide 

interests 

Depressivo Depressian Sadness, wony, caution, retardation 

Autistic Daydreams Visual imageiy, shyness, secluaiveness, sensi- 

tivoncss 

Paranoid Fixed-ideas Egotism, durability of opinion, argumentative- 

ness, rationalization 

Epileptoid Fiojccirmaking Plnnfulncas, zncticulousnoss, inspiration to 

achievement 

Guilford while recognizing most of the Humm- Wadsworth 
components, fractionates some of them and adds others. His list 
follows: 

1 Humm, Dokcabter G. Personality and adjustment. /- Psychol., 1042, 13, 
100-134. 

a Rosanoft, a. S- Manual of psychiatry. (6th Ed) New York* Wiley, 1027. 
» Humm, Doncaster G. Personnel evalualion method. Los Angeles* Don- 
caster G. Humm Personnel jServicc, 1945. 

* The component descriptions presented here have been adapted from 
material on tlie Oudford-Martm temperament profile chart. Beverly Hills, Calif.: 
The Sheridan Supply Co, 



78 PRINCIPLES OF PERSONNEL TESTING 


Component 

Social intTovonioii'* 
extroversion 

Tiiinking inii-ovoraian- 
exiroverfflon 

Ebathymia 

Depreasion 

Cycloid disiiositioii 

Qenoral activity 

ABcendancc-submissioii 

Masculinity-femininity 

Inferiority feelings 

Nervousness 


High Degree 

Sociability, tendency to 
seek social contacts and 
to enjoy the company 
of otbera 

Lack of introspection and 
an extrovevUvo orion* 
tation to the thinking 
process 

Tendency to bo happy- 
go-lucky or carefree, 
lively and impulsive 

Freedom from depros- 
sion, cheerful, optiinls- 
tic 


Stable, omotional roao^ 
lions, freedom fmm 
cycloid tendencies 


Tendency to engage in 
vigorous ovorb action 


Social leadership 

Masoulinity of emotional 
and temperamental 
make-up 

Sclf-Goiifidonco 


Tendency to bo calm, un- 
ruffled, and relaxed 


Low Degree 

Shyness, tnndcnoy to 
withdraw from social 
situations and to bo 
seclusive 

Inclination to meditative 
thinking, philosophic 
ing, analysing one's 
self and others 

Inhibited disposition and 
ovorcontrol of impulses 


Chronically dopressod 
mood, including feel- 
ings of unwortliincsa 
and guilt 

Cycloid tendencies as 
shown in strong emo- 
tional reactions, ilua- 
tuation in mood, ten- 
demiy towanl flight- 
inesa and instability 

Tendency to inertness 
and a disinclination for 
motor activity 

Social posscfisivoness 

Femininity of emotional 
and temporumcntal 
make-up 

Lack of confldcncc, un- 
dervaluation of one's 
self, and feelings of 
inadequacy and in- 
feriority 

Jumpinoss, jitteriness, 
and tendency to bo 
easily distracted, irri- 
tated, and annoyed 



TEMPERAMENT AND PERSONALITY TESTS 


79 


Component 

High Degree 

Low Degree 

Objectivity 

Tendency to view one's 
self and surroundings 
dispassionately 

Tendency to be hyper^ 
sensitive and to take 
everything personally 

Cooperativeness 

Willingness to accept 
conditions witli toler- 
ant attitude 

Overcritloalness of people 
and things 

Agreeablcncsfl 

Lack of quarFclsomonesa 
and domineering qual- 
ities 

Belligcient and domi- 
neering attitude and 
overreadiness to fight 
over trifles 


While many temperament scales designed to measure one or 
more components have been developed, generally speaking the 
components measured appear in one or both of the above lists. 

THEORY OF TEUPEKAUENT MEASUBBUEHT 

The Basis of Measurement. — Although test makers and theo- 
rists often disagree in their concepts of temperament and in 
whether or not they accept or reject certain components, they 
do agree in one basic respect; that is, an individual's present and 
previous behavior is the best key to his future behavior. Con- 
sequently the questionnaire approach is utilized. Questions ore 
employed that are designed to detennine how a person says he 
has behaved in past situations or how he feels or thinks about 
specific situations now. Hence, to the extent that the questions 
selected tend to sample traits or habits identified with the com- 
ponent in question, a quantitative statement of that tendency is 
obtained. 

How Items Are Selected. — ^In the preparation of temperament 
scales, authors first prepare a compr^ensive list of questions in- 
tended to measure the component in question. The questions 
included are selected in terms of cUnical experience or of a 
logical analysis of the component. These questions then receive 
a tryout with criterion groups. The groups may be composed of 
clinically chosen individuals; one of them may be composed of 
institutionalized oases; or they may be made up of persons who 



PRINCIPLES OF PERSONNEL TESTING 


80 

have been judged by supervisors or others to be extreme insofar 
as the tendency is concerned. An item analysis is then made to 
determine what proportion of the high-criterion group and what 
proportion of the low-criterion group gave a particular response 
to the question. To the extent that the question has been an- 
swered differently by the two groups, the item is said to dis- 
criminate and is retained as a good item; to the extent that both 
groups tend to give the same answer, the item is rejected. As a 
result of this procedure,* the best items are retained for inclusion 
in the final form of the scale. Autliors of those scales have 
pointed out that rai’ely can one predict with absolute certainty 
how a given item will be responded to or how well it will dis- 
criminate. In an effort to mask or withhold their true attitudes 
or feelings, individuals talcing the tests sometimes give answers 
opposite from those expected, which answers nevertheless may 
be equally indicative in terms of the statistics. 

Kinds of Tests. — ^Temperament tests ore occasionally designed 
to measure one component. Allport’s Ascendance-submission 
Test is an example. More frequently, however, the scales are 
multiphasic in the sense that they ore designed to measure 
several components of temperament at once. In the latter cose, 
a profile * is generally used to interpret the test for a single in- 
dividual. Frequently, one’s score on a given component is not 
so important as is that score in rdiation to his score on one or 
more other components. 

Accuracy of Report. — Questionnaires of tlie "yes or no” 
variety have been ci’iticized because the person taking the test 
is at liberty to answer either way in spite of how he really feels 
at the time or how he acts under certain circumstances. It is 
further alleged tliat not all individuals are capable of reporting 
accurately, even tlibugh their intent is good. Humin has at- 
tacked these criticisms in two ways, first by providing an indus- 
trial standardization of his teat; in other words, his norms have 
been established on persons who were applying for a job and all 
of whom found it to their advantage to present what seemed to 

X Bca Chap. Xin, pp, 183-188. 

X Sco Chap. IV, p. 61. 



TEMPERAMENT AND PRISON ALITY TESTS gj 

be the best picture of themselves. His second approach is 
through an evaluation of the number of "no" answers given 
and an analysis of the balance among the various components. 
Persons who present inaccurate pictures of themselves tend to 
“underreport” or “overreport.” An extremely negativistic per- 
son accumulates more than the expected number of no’s, while 
the extremely suggestible person tends to accumulate fewer 
than the expected number of no’s. On the basis of his research 
he first established an acceptable range for the “no” count, 
and any record was considered acceptable or nonacceptable 
in terms of whether or not it met this criterion. He has since 
developed techniques for correcting scores for records having 
unacceptable “no” counts so that practically no records are now 
eliminated for this reason. 

In terms of practical application, however, these are theoreti- 
cal questions to be left to the test maker and the research man. 
If a test discriminates between successful or unsuccessful em- 
ployees by some techniques such as those outlined in Chap. II, 
then whether applicants reported accurately or not becomes a 
matter of secondary importance. 

SPECIFIC EXPERIENCES WITH TEHPERAHBNX TESTS 

The Humm-Wadsworth Temperament Scale. — ^The Humm- 
Wadsworth * is one of the more widely used in industry of all 
temperament tests. It consists of over three hundred questions 
of the “yes or no” variety and is scored in terms of the seven com- 
ponents listed earlier. The scale was originally validated by ad- 
ministering the initial list of questions to seven pairs of criterion 
groups representing the two extremes of each of the components. 
As a result, the discriminating questions were identified; and al- 
though a testee still answers all the questions, only about half of 
them are scored. For each person taking the test a profile is pro- 
vided, and his performance is interpreted in terms of his total 
picture on the components listed on page 77. 

An example of the applicability of the scale has been provided 

^ HuMif, Doncaster G., and Wadsworth, Got W., Jr. The Humm-Wads- 
worth temperament scale. Atner. J, Ps^ckiaL, 193S, 92, 163-200. 



82 PRINCIPLES OP PERSONNEL TESTING 

by Dorcus* who administered the scale to fifty industrial em- 
ployees, divided into two criterion groups, not on the basis of 
production but on the ''basis of tlie fact that they showed signs 
of maladjustment and discontent or were problem employees.” 
Without knowledgo of how the fifty employees were classified, he 
designated each individual as satisfactory or unsatisfactory on 
the basis of the test profile. Figure 6-1 sliows the results 
obtained. Of those classified os satisfactory on the basis of the 
test, 75 per cent had been previously rated as satisfactory em- 


Satisfactory on test 


UnsQtisfoetory 6* 
on test 



Per cent rated Per cent roted 
unsotiafactory eatlefactory 


75% 


Fla. 6-l/--ProporfionH of tlioso considered satisfactory and unsatisfactory on 
tho Kumm test who were rated satisfactory and unsatisfactory l^y supervisors. 
(Flroin Dorcits,) 


ployecs whereas only 25 per cent had boon considered as unsatis- 
factory. Of those classified as unsatisfactory on tho basis of the 
test, 67 per cent had previously been considered as unsatisfactory 
whereas only 33 per cent had been rated satisfactory. In other 
words, Dorcus, using the test, properly identified 75 per cent of 
the satisfactory employees and 67 per cent of tlie unsatisfactory 
ones. Humm* has criticized this study on two counts. First 
he says that standardized test conditions were not observed and 
that the employees in this instance were reassured too much that 
"tho test in no way will affect your status with the company.” 
This, he says, has resulted in relieving tho testees of certain 
tensions and fears, which caused them to respond differently 
from the case with the standardizing groups, lie maintains that 
special training is necessary and that had tho proper procedures 
and interpretations been utilized the results would have been 

^ Dorous, Hot M. A brief study of the Iluinm-Wiulswoiih tomporomont 
Bcalo and the Guilford-Martin personnel inventory in nn industrial situation. 
J. appl. PtyehoL, 1044, 28, 302-307. 

^ Humm, Doncabtisr Q. Disoiiiaion of Dorcus’ study of tho Ilumni-Wads* 
worth temperament scale. J. appl, Payehol., 1044, 28, 627-620. 



TEMPERAMENT AND PEB^NALITY TESTS 88 

in greater agreement. Hunmi also expresses his opinion that 
this is not a validity study. Ho says that it is a study of the 
applicability of the scale to an industrial situation. Validity, 
he implies, is determined by how well the scale identifies those 
who possess strong or weak tendencies insofar as the various 
components are concerned. Humm ^ has conducted such studies 
and has reported coefficients of correlation in the high 90’s 
between profile interpretations and clinical and case history 
studies. 

Gviilford’s Inventories. — ^Three different scales ore used to 
measure Guilford's thirteen components listed on pages 78 and 
79, These are his Inventory of Factors STDCR, intended to 
measure social introversion-extroversion, thinking introversion- 
extroversion, rhathymia, depression, and cycloid depression; the 
Inventory of Factors GAMIN, designed to measure general activ- 
ity, ascendance-submission, masculinity-femininity, inferiority 
of feelings, and nervousness; and the Personnel Inventory I, in- 
tended to measure objectivity, cooperativeness, and agreeable- 
ness. Of the three, the last is receiving greatest acceptance in 
industry. According to Martin * it measures three subfactors of 
the paranoid component and is most useful in identifying the 
potential troublemaker. He presents evidence to support his 
statement from two industrial situations. 

Fifty-one employees in an aircraft-parts manufacturing plant 
who were rated by supervision as satisfactory or unsatisfactory 
in terms of adjustment were given the scale. In another study 
forty-three textile mill employees were rated and tested in the 
same fashion. In each instance the test properly placed between 
70 and 75 per cent of the employees. 

Dorcus * used the Personnel Inventory I at the same time he 
used the Humm-Wadsworth Temperament Scale, and his find- 
ings are presented in Fig. 6-2. Of those classified as satisfactory 
by the test, 73 per cent were rated satisfactory; and of those 

^ Humm, Doncasteb G*, and Humbc, K A Validity of the Humm-Wada- 
worth Tomperamcnt Scale. J. PaychoL, 1044, IB, 55->64. 

^ Martin, Howard G. Locating the troublemaker with the Guilford-Martin 
Personnel Inventory. /. appl , PsyohoL, 1944, 28, 461-467. 

B Dobqub, op ciL 



84 PRINCIPLES OF PERSONNEL TESTING 

classified as unsatisfactory by the test, 68 per cent were rated 
unsatisfactory. 

While the authors of the scales believe that the cooperative- 
ness component is the most important of the three, they believe 
that all three are important. Guilfortl^ cites one study in 
which a group of employees were classified as being above 
average on two or three of the components or below average on 
two or three of the components. His results are presented in 


SoHifoctory on l«it 27% 
68 % 

UnBottofactory 
on test 


Per cent rated 
unsotlBtactory 


1 75% 


P«r cent rated 
eatisfactarv 


Fiq. Q-2r— Proportions of those considered setisraotory utid unBotisfactory on 
the Penonnel Inventory I who wcio rated sftiisfoctory and imsatisfactory by bu- 
pcrvisora* {from Dorcue,) 


Mg. 6-3. Of those rated ns being satisfactorily adjusted on the 
job, 66 per cent wore nbovo average on two or three of the com- 
ponents; of tltose rated as being adjusted in an unsatisfactory 
manner, 73 per cent were below average on two or three of the 
components. 

Although the inventory is relatively new, tho authors have 
collected enou^ tryout data* to make tho following claims 
for it: 

1. Approximately 85 per cent of the employees rated as mal- 
contents by management are found to have received a raw 
score below 60 on the cooperativeness trait. 

2. A number of executives who have taken the test have been 
found to be well above ihe average on both objectivity and 
cooperativencss but somewhat below average on agreeable- 
ness. In other words, the lack of dominating qualities in 
an employee is an important factor in making him a con- 
tented, peaceable worker; but in the case of an executive. 

* Manual of direettOM and norms. Beverly Hilla, Cniif,; Sheridan Supply Go. 

* Sonic in/ortuattoA cbom (he Guil/ord-JMarlw Personnel /nvenlorp I. Bev- 
erly Htlle, Calif.: Sheridan Supply Co. (Mimeo.) 



TEMPERAMENT AND PERSONALITY TESTS 86 

dominating qualities may be an important contributing fac- 
tor to his supervisory success. 

3. We have found some few cases where the testee fails to 
answer questions truthfully. This seems to be the excep- 
tion rather than the rule and to occur most often if they 
do not answer the questions as they believe management 
wants them to answer. The more intelligent the person is, 
the better can he falsify consistently if he is so inclined. 

Rotad aolitfoelory 
Ralad untoMafoclory 

Fia. e-3j— FroporUona of thoae employees rated satisfactoiy and unaatisfac- 
toiy who were below average or above average on two or three of tlio components 
in the Personnel Inventory I. 

Experience with Other Scales. — As stated earlier, there ore 
many scales and inventories designed to measure one or more 
temperament components. The use of many of these has been 
confined to the psychological clinic and the laboratory, and re- 
ports of their use in business and industry have been scanty. 
Bernreuter’s PeraoncAity Inventory has been widely used in the 
selection of salesmen. Much of this work has been done by con- 
sulting companies and reports have not been available in pub- 
lished form. Bosenstein ^ reports that 'Tour factors described 
and considered by this questionnaire have been found of value 
in the selection of salesmen, extroversion, dominance, self-confi- 
dence, and social independence or self-sufficiency.” Chapter XI 
presents some facts relative to the value of the Bernreuter in 
combination with other measuring devices. 

Schultz * administered the Beckman Revision of the A-8 Teat 
and the Root Introversion-extroversion Test to a group of 259 

^ Rosrnbtein, J. L, The ecientifio selection of saleemen. New York* 
McGraw-llill, 1944, 162 pp. 

> Schultz, Hichard S, Standardiced tests and statiBlical prooediires in seleo* 
tion of life insuraaoe sales peFsonnel, /. appl, Psychol, 1930| 20« 663^66. 




86 PR]NaiPLB8 OP PERSONNEL TESTING 

insurance salesmen. His results presented in Pig. 11-1 and dis- 
cussed in Chap. XI are pertinent here. Among the poorest pro- 
ducers, 53 per cent were considered unacceptable on the test; 
among the best producers, only 32 per cent were considered un- 
acceptable on the test. 

One of the newer scales and one tliat is based upon a slightly 
different approach is Jurgensen's Classification Inventory. The 
scale differs in two important respects. (1) Using a new type 
of item, the author has endeavored to include only those which 
research has indicated are at least subject to falsification by the 
testee, and (2) he recommends the scoring of the scale by 
occupations rather than by components. This approach is some- 
what revolutionary, and little validity information is at present 
available. However, he does report a coefficient of correlation 
of 0.80 between test scores and criterion measures in the case of 
forty salesmen. Further research will be needed to establish the 
validity of the scale. 


SUMMARY 

While the term temperament is not universally defined, it is 
used here to denote behavior tendency, that is, the tendency of 
an individual to behave in characteristic ways. Behavior acts 
tend to cluster into patterns usually called components. Some 
temperament scales are designed to measure a single component, 
but most scales are intended to measure several. The problem 
of intentional and unintentional falsification on inventory types 
of scales is a real one but is relatively unimportant when the 
same validation procedures discussed earlier are employed. 



CHAFTBR VII 

INTEREST AND PREFERENCE TESTS 


Some people enjoy human coiitaets; Bome enjoy manipulative 
activities; and still others prefer to engage in verbal activities. 
What an individual likes, in addition to his mental ability and 
his temperament pattern, is one factor associated with occupa- 
tional adjustment and success. This chapter deals with the 
measurement of interests and preferences. 

TH£ UBASUREUENX OF INTERESTS 

Types of Items. — ^Interest and preference testa utilise a num- 
ber of different types of items or questions, but a discussion of 
two major ones will suffice as illustrations. The preference item 
is one containing two or more responses. Each response repre- 
sents an activity or an object, and the individual is asked to 
indicate the one he likes most, sometimes the one he likes least, 
sometimes the one he likes second best, and so on. The respond- 
ent must make a choice between activities or objects. In the 
other type of item, he is given the name of an activity or an 
object and is asked to state whether he likes it, dislikes it, or is 
indifferent to it. Both types of items have produced excellent 
results, and no effort is made here to demonstrate the superiority 
of one over the other. 

Basic Scoring System. — ^There are two basic scoring systems 
presently in use, each of which has a counterpart in temperament 
testing. The first is the scoring of the test by specific occupa- 
tions, in which a different key is prepared for each occupation. 
Thus, answering "dislike" to a particular item might contribute 
to a high score on the lawyer key and to a low score on the life- 
insurance-salesman key. Strong has standardized his Vocational 
Interest Blank for Men by this method, and at present there are 
available twenty-seven different occupational scoring keys. 

87 




88 PRINCIPLES OF PERSONNEL TESTINO 

The other approach utilizes the component approach. Thus, 
the items indicating interest in mechanical activities are grouped 
togetheri those indicative of scientific interest are grouped to- 
gether, and so on. The applicant receives several different scores 
on the same test, one for each of the components measured by 
it. For each person, a profile is prepared and an evaluation of 
his relative interests in different fields can be determined. For 
example, he may be in the upper 10 per cent of the general popu- 
lation in the strength of his mechanical interest, whereas he may 
be in the lower fourth of the same population insofar as interest 
in literary activities is concerned. This approach is used in the 
Kuder Preference Record. Both approaches have their advan- 
tages and their disadvantages in terms of the specificity of their 
resulis. 

How Items Are Validated. — Items in tests or scales of this 
character are validated in a fashion similar to the procedure dis- 
cussed earlier in conjunction witli temperament testing. This is 
particularly true in the preparation of occupational ke3rs. In 
the process of preparing a key for the job of accountant, for 
example, the scale is administered to a large group of accountants 
who have been engaged in Uiat occupation for a minimum stipu- 
lated period of time. Item counts are made, and the proportion 
of accountants responding in a given way to a specific question 
is compared with the proportion of nonaccountants or ^e pro- 
portion of a general population responding in the some way. To 
the extent that accountants tend to answer the item in the same 
fashion as the other group, the item is given a weight of zero for 
scoring interest in that partieviar occupation; it may be quite 
important in anothei* occupation. Likewise, to the extent that 
accountants appear unique in that they tend to answer dif- 
ferently than the mine-run of people, the item is given a weight 
or value. The range of these values varies with the several tests, 
but the greater the difference, that is, the more unique accoun- 
tants are, the greater the weight. Once keys are prepared, one 
simply adds the various weights assigned to the responses given 
by a single individual to obtain a total score. 

The validation of items for inclusion in a component or in- 



INTEREST AND PREFERENCE TESTS 8B 

terest area is a bit more complex. Generally speakingj the pro- 
cedure is as follows. All of the items that logioally appear to 
measure a given area, mechanical, for example, are identified, 
and the test papers of a trial group are scored on these items 
only. Two subgroups are then chosen, one composed of perhaps 
the 25 per cent making the highest total scores and another 
composed of a similar percentage making the lowest scores. 
These are then known as the ciiterion groups, and item counts 
are made for each group. The proportion of the “high” group 
and the proportion of the “low” group responding in a given 
way on a specific question are compared. To the extent the 
proportions are the same, the item is dropped from that com- 
ponent; and to the extent they are different, it is retained. This 
process which is said to employ the criterion of internal con- 
sistency is repeated with each component until each item is 
properly assigned to one or more of the areas. 

OCCUPATIONAL GROUP DIFFERENCES 

While Sorbin and Anderson have presented some evidence 
regarding the relationship between expressed job dissatisfaction 
and measured interests, scales have been most extensively vali- 
dated by means of occupational groups. Strong has shown sig- 
nificant differences in the scores obtained by persons in various 
occupations. Doctors, for example, con be expected to score 
higher on the doctor key than on any other key; as implied 
earlier, this is inherent in the construction of the scale. Kuder, 
using nine components on the Kvder Preference Record, has 
demonstrated significant differences in the average patterns of 
various occupational groups. iFigure 7-1, for example, is the 
group profile for twenty-seven employed chemists. Note that 
in terms of expressed interest in scientific activities, the group 
average in the upper 10 per cent of the general population and 
that they average in the lowest 10 per cent in their interests in 
clerical activities. Figure 7-2 is the group profile for twenty- 

^ Sabbin, Thbodobii B., and ANDBDaoN, Hrdwin C. a pieliminoiy etu^y of fhe 
relation of ineaaured interest patterns and occupational dissatisfaetion. Edito. 
Ptyeh. Meaa^ 1042, 2, 23-2S. 



go PRWCIPLEB op PERSONNEL TESTING 

seven male accountants. Note that their expressed interests are 
quite high in computational and clerical kinds of activities and 



Fio. 7-1^-^roup profile based upon mean performance of iwcuty-soven chern* 
iats on Kuiet Prejerence Records (From Kxidar*) 



Fra. 7-2d--<5roup profile baaed upon mean performanoo of twenty-sovon male 
aooountontfl on Kader Pr^feronce Record, (From Kvder,) 

that when compared with the general population, they are lowest 
in their mechanical interests. These two patterns prepared from 
data selected from Kuder’s manual ^ are illustrative of a large 

» Int$rmedialo tiumutd for Iht Ruder f/refermue record. CliicAgo: Soienoe 
Beeearoli Ansocifttcs, 1944, 



INTEREST AND PREFERENCE TESTS 


01 

number that he presents. Generally speaking these group pro- 
files correspond to expected patterns arrived at by logical analy- 
sis of the occupation. 

ABILITY GROUP PATTERNS 

Method Crildcized. — ^The occupational group method for 
validating interest tests has received some criticism just as it 
has in connection with mental abilily and temperament tests. 
What evidence is there, say the critics, that one is worse off in 



Fio. 7-3^ — ^Proportion of Ihirty-fiix advortitimg men who received A (solid 
bara)f B (open barB)j and G (shaded bars) grades on Strong’s Vocaiionnl Inter- 
est Bhnh ^ when grouped according to ratings. 

an occupation just because his interests do not coincide with 
those of the typical employee in that occupation? Strong has 
answered these criticisms very effectively with considerable re- 
search, only part of which will be reported here. 

One study ^ involved thirty-six advertising men who were 
rated as to their occupational competence by three independent 
judges and their ratings were combined. They were scored on 
the "advertising man” key of Strong’s Vocational Interest Blank 
for Men and given grades of A, B, or C. ITigure 7-3 shows the 
percentage of each rating group that received each test grade. 
Note that C grades were reedved only by the poorest and next 
poorest group and that 100 per cent of the poorest group received 
C grades. Neither were there any A’s in these two bottom 
groups. The consistent increase in number of A’s with succes- 

1 SmoNa, Edwaiid K., Jb. Voc<Uional inleretts of men and vmnen. Stanford 
Univenity, Calif.*. Btanfoid Univemly Freaa, IMS, p. 601. 



92 PlillfaiPlES OP PERSONNEL TESTING 

sively higher rated groups is apparent. Strong ^ also has com- 
pared the scores of 288 insurance agents, eacli selling in excess 
of S150,000 annually, with the scores of twenty-seven individuals 
who failed as insurance salesmen. Figure 7-4 sliows the results. 
Among tire successful group 75 per cent scored A, 24 per cent 


Agents averaging $l00|000 
onnuoMy (N«286) 

32% 

Fgllurn(N>e7) 

ITia. 7-4.~Proportioiis of erouiis of siivccsBrul nnci unmiOOCBaful insuraiico eales- 
men who received grndcii of A (solid bats), B (open bars)> tmd 0 (shaded ban), 
on Strong’s Voeotfonef Inltircai Blank, {From Strong.^) 



Par cant w1 th voluma Per cenLwith volume 

«ndar'l50,000 over^SCkOOO 

Fra. 7-5.*— Proportions of groups of jnsurnneo salesmen receiving various grades 
on Strong’s Vocalum<d Interest Blank for men who exceeded a sales volume of 
$160,000. {From Strong fi) 



scored B, and only 1 per cent scored C, whereas among the 
failures 32 per cent scored C, 46 per cent scored B, and only 22 
per cent scored A. 

In another study involving 211 insurance men, Strong* 
divided the group into two subgroups, those with an annual sales 
volume below $150,000 and those with a volume in excess of that 
amount. Figure 7-6 sliows the relative proportions of these two 
subgroups at the various test levels. For example, of those re- 
ceiving A on the test, 66 per cent were in the high producing 
group and 44 per cent were in the low producing group. In the 
same fashion, of those scoring G 94 per cent were low producers 

1 /bid., p. 480. 

S /bid., p. 402. 


INTEREST AND PREFERENCE TESTS 93 

and only 6 per cent were high producers. The general trend, 
though somewhat irregular, is nevertheless marked, and there is 
no question but that superior scores on the insurance key are 
associated with superior sales performance. 

Bills* has reported a similar study involving casualty- 



Fid. 7-6.— PropoitioDfl of groups of casualty insurance salesmen seorinB at vari- 
ous levels on Strong's Vocaltonal Interest Blank for men who were rated eucoess- 
ful and failures. (From Bills.) 

insurance salesmen. She investigated £88 salesmen who were 
rated by their respective managers as failure or successful. She 
scored each man with the life-insurance-salesman, key and the 
real-estate-salesman key and worked out a combined test score 
which is represented in Fig. 7-6 by the letters A to E. The figure 
shows that of those receiving a combined score of A, 78 per cent 
were successful and 22 per cent were failures whereas of those 
scoring E only 24 per cent were successful and 76 per cent were 
failures. 

Ryan and Johnson ‘ conducted similar studies with a group of 
men engaged in selling machine accounting methods and another 
group of accounting machine service or repair men. Figure 7-7 
presents their findings for a second group of salesmen after they 
had previously determined scoring weights with a preliminary 
group. These salesmen were grouped in terms of job perform- 

^ Bills, Marion A. KclatJon of scores in Strong’s interest analysis blanks to 
success in Bolling casualty insurance. J. appl. Psf/choL, 1038, 22, 07-104, 

> Ktan, T. a., anu Johnson, Bbatricb B. Interest scores in the selection of 
salesmen and servicemen: Occupational vs. ability-group scoring keys. /. appL 
Ptyckol, 1042, 26, 643-502. 



94 PRINCIPLES OF PERSONNEL TESTING 

anoe, and the figure shows the percentage of each above and 
below a critical score. Figure 7-8 shows their findings with the 
accounting-machine servicemen. The same method of presents^ 
tion is employed, and the same general trend is prevalent. 


Bast on job 
Middle 

Poorest on job 


29 %! 

31 % 


1 


my/A 


|ri% 

69 % 


60 % 




Por cent below 
critlcQt score 


■■ 40 % 

Per cent above 
critical score 


Fio. T-T^Froportions of threo ability groups of machine accounting methods 
salesmen who scored above and below a critical score on Strong's Vocaltonal In-> 
terest Blank Jor AI efi. (Fiom Byan and Johnson,) 


Best on job 
Middle group 
Poorsti on job 



Per cent babw 
orltlcol score 


Per cent above 
orifloal score 


Fia. 7-^ — Froporlions of threo ability groups of accounting machine servico* 
men who scored above and below a critical score on Strong's Vocalional Inlerosi 
Blank /or (From liyan arid Johnson,) 


LIMITATIONS OF INTEREST TESTS 

Guidance Situations. — ^Interest tests are probably most useful 
in guidance kinds of situations where the individual taking the 
test is seeking direction and where all of the influences favor 
accurate reporting on his part. There is no guestion but that 
individuals can falsify their reports and that to the extent that 
they are intelligent and have insight into the nature and de^ 
mands of jobs they can influence their scores in the desired 
direction. 

However, two facts should be kept in mind. There are many 
business and industrial situations where placement or direction 
after the individual has been hired is the real problem. The 
experience of one company that hires a large number of graduate 



INTEREST AND PREFERENCE TESTS 95 

engineers is illustrative. Having hired these young men and 
placed them on the pay roll, the company finds it advantageous 
to administer the Kuder Prejerence Record to them to asfflst in 
directing them into soles work, supervision, research, etc. Here 
it is to tiie applicant’s best advantage to report accurately. 

The second point is one that was also made in connection with 
temperament testing. Provided tlie test is always administered 
in the application situation and provided the test is always vali- 
dated along the lines outlined in Chap. 11, the personnel or 
employment man cannot go wrong. When this factual approach 
is used, and if significant differences between acceptable and 
unacceptable are obtained, how much tlic applicant falsified is 
secondary. The criteiion is “Does the test work?” 

SUMHAKY 

Interests and preferences are factors in vocational adjustment 
and job success along with mental ability and temperament. 
Interest testa or scales are usually one of two types: Either they 
ore scored several times for different occupations, or the scale 
is scored by interest areas or components and each individual 
is given a profile. Both have been effectively used. Scales have 
been validated both by means of occupational groups and by 
means of ability groups. 



CHAPTER VIII 

VISUAL SKILL TESTS 


Few abilities are as almost universally necessary to the per- 
formance of jobs as the ability to see. Although there are jobs 
that can be done well by blind persons, the vast majority of jobs 
require that the worker be able to see, even though the visual 
demands of jobs vary. This chapter deals with vision as it is 
related to job performance. 

THS ITATIIBS; OF VISION 

Complexity of Vision.— Personnel managers and others re- 
sponsible for the selection of properly qualified employees have 
long recognized the importance of vision in conneotion with job 
performance as is attested by the frequency with whioli measures 
of acuity or keenness of vision have been used. Usually the test 
has consisted of a letter chart placed about twenty feet away, 
and more times than not this has been the only test used. The 
implication is twofold: (1) Acuity or ability to perceive detail 
is all that there is to seeing, and (2) one's ability to perceive 
detail at a given distance is an accurate indicator of his ability 
to perceive detail at other distances. Actually neither of these 
assumptions is completely true. Figure 2-2 on page 12 is evi- 
dence. Employees who did the best job of looping actually had 
poorer distance acuity than did the poor loopers. Seeing is an 
extremely complex act. Some professional eye men recognize as 
many as twenty different visual skills; ^ but whether one accepts 
the total list or not, research has definitely demonstrated that 
multiple skills exist and that many of them are relatively un- 
related. 

Visual Skills and Postures.— The separation of visual skills 
and visual postures is largely a matter of definition. Generally 

*■ SuBPAffl), C. p. Vtoual skills. OptornoMc Weekly, 1844, 34 (61), 1466-1406. 

90 


VISUAL SKILL TESTS 


97 

speaking, however, the skills include acuity (or keenness of 
vision) at various distances, color vision, and stereopsis, a factor in 
depth perception. Postures included the phorias, lateral and ver- 
tical. Measures of vertical phoria indicate the extent to which the 
two eyes tend to seek a normally level position in contrast to a 



Fra, 8-1.— Balationflhip between distance acuity and the job nerformanoe of 
225 radio-tube assemblora, (From Tiffin and TF»rt.) 

position in which one eye tends to assume a position higher or 
lower than the other. Measures of lateral phoria indicate the ex- 
tent to which the eyes tend to converge or diverge in the lateral 
plane. 

Relationships between Characteristics. — Geise* has demon- 
strated only a slight relationship between an individual’s ability 
to discriminate detail at various distances. For example, if one 
knows how well an individual can see at twenty feet, he can 
estimate how well that person can see at fifteen inches but with 
a degree of accuracy little better than chance. Geise’s findings 
support previous studies, the results of one of which are pre- 
sented in Figs. 8-1 and 8-2. Two hundred twenty-five radio- 
tube assemblers* were rated as A or B operators and were also 
given a standard battery of vision tests. Figure 8-1 shows the 

‘ Gbibb, W. J. The inter-rolationahip of visual acuity at different distances. 

J, appl, Pst/ehol^ 1946, 80, 91-106. ... . . 

* Tiffin, Jobbpii, and Wibt, S. E. DetormininB visual Btandajda for induatnal 
jobs by HtfltriBticHJ methods. Trane. Amer. Acad. Ophlkal. and Otolar,, Novem- 
ber-Dcccmber, 1946, 4-26. 



98 PRINCIPLES OP PERSONNEL TESTING 

percentage of A operators found among those scoring at various 
levels on a far acuity test (approximately twenty-six feet). 
Although the curve is somewhat irregular, it can be seen that 
there are no more A operators among those with the best dis- 
tance acuity than there are among those operators with poorer 
distance acuity. If there is a trend, it is in the direction of fewer 
A operators among those with better distance acuity. Figure 
8-2, however, tells quite a different story. The percentage of A 



Fia. 8'2.~-Bclatioii8hip between near acuity and job performance of 226 radio- 
tube aeeemblen. (From Tiffin and Wirt.) 

operators increases from about 25 per cent among those who 
score at the lower end of the near acuity test to approximately 
85 per cent among those who scored highest on the same test. 
These studies along with many others support the conclusion 
that no single vision test will yield all of the information that is 
needed for adequate job placement; a battery of vision tests is 
necessary. 

Changes with Age.— Visual skills change with age. Figure 
8-3 based upon 7,332 employees in a steel mill shows ilie per- 
centage of those employees of various ages who passed a far 
acuity test without the aid of spectacles. Note that the curve is 
reasonably flat until age forty, at which time it begins to drop 
systematically. Figure 8-4 indicates a similar trend in unaided 
near acuity. Color vision similarly deteriorates witih age. 



35 40 45 SO 55 60 65 

Age 

Fio. 8-4.— The perceataEO of pcraonB nt various age levels who were able to 
pass a certain near acuity test without glassea 

Figure 8-6 from Tiffin and Kuhn^ shows a decrease of from 
approximately 70 per cent among twenty-five-year-olds to ap- 
proximately 25 per cent among sixty-five-year-olds in the per- 
centage of employees passing a red-green color discrimination 

^ TnmH, JoeiPH, and Kuhn, Hmwia 8. Color discrimmatioa in indusby. 
Ateh. Ophttnl, 861-8S9. 


100 PRINCIPLES OF PERSONNEL TESTING 

test. Not all visual skills, however, decline with a^e in this same 
systematic fashion. Figure 8-6 shows the changes that take 
place in stereopsis with age changes. The percentage who could 
pass a given test increased from about 75 per cent at age twenty- 



Pia. 8-6.— The percentage of persona at various age levels who were able to 
pass a rod-greea color discrimination teat (From Txffm and Kuhn,) 



Fio. 8-6. — The percentage of persons at various age levels who wore able to 
pass a certain depUi-perception tcet (i\f=8,4l2), 

two to about 85 per cent at age thirty-two, remained reasonably 
constant to about age forty-eight, and then declined. At age 
sixty-eight, less than 65 per cent of these employees could pass 
the test. It can readily be seen that to the extent visual skills 
affected by age are related to success on a particular job, these 
age shifts are extremely important. Fortunately, vision may be 
aided through professional eye care. Although there are changes 



VISUAL SKILL TESTS 


101 

and deterioration in visual skills with age, adequate professional 
eye core can to a considerable degree keep visual performance 
somewhere nearly constant. To the extent that job demands are 
considered in the giving of professional attention, improvement 
on the job can result. 

Changes through Eye Care. — ^Kephart * has shown how im- 
proved vision through eye care may result in improvement on 
the job. In a study of thirty-one pairers in a hosiery mill who 
were on the job for a year during which the study extended, 
seventeen received professional eye care through regular com- 
munity channels and fourteen did not visit an eye doctor during 


Eya core 



ISOt 


No aya core 

Fia. a7^TI)c averngo cents per hour increase in earnings of seventeen em- 
ployees who received professional eye care and of fourteen who did not. (From 
Kopharl.) 


the year. Since employees on this job are paid on a piecework 
basis, a comparison of earnings accurately reflects shifts in pro- 
duction. The average earnings of all thirty-one employees in- 
creased during the year period; but as is indicated in Fig. 8-7 
from Kephart’s data, those who received professional eye care 
were averaging 15 cents per hour more at the end of the year as 
compared with an average increase of only 6.3 cents for those 
who did not seek eye core in the course of the year. 


MEASUKBMENT OE VISUAL CHABACTEBISTICS 

CUntcal and NoncUnical Tests. — In basic philosophy, differ- 
ences exist between clinical eye tests and nonclinical visual 
performance tests primarily in regard to test construction, test 
administration, and test application. Optometrists and ophthal- 
mologists use clinical tests to help in diagnosing physiological, 
pathological, and functional conditions and as aids in determin- 
ing proper corrective procedures and prescriptions. Such tests 
usually ore flexible to meet the various approaches of the doctor 

1 KsPHARTy Nbwsli* C. An analysis of eye care and industrial efficiency. 
2^rans» Amer, Acad, OphihaL and Otolar,, March-April, 1040, 1-6. 


102 pniNCIPLBS OF PERSONNEl TESTING 

and his clinicians in order to reach the needs of each patient. 
On the other hand, nonclinicol visual performance tests, con- 
structed on a sound psycliological basis, sliould be administered 
according to an unvarying standard procedure. For this reason, 
the demands of administering visual skill tests according to 
standard practice are repetitive and often tedious. Sucli pro- 
cedure does not permit opinionated expression of the administrar 
tor or the interpretation of any test scores. It follows that 
thorough training in noncUnical test administration according to 
standard practice is essential. Only a few, if any, optometrists 
or ophthalmologists have felt any inclination to do nonclinical 
testing themselves, the clinical procedures appealing to them 
because of their temperament and training. Because of standard 
practice, trained laymen well adapted for this job make success- 
ful operators. The training of the layman for the nonclinical 
test administration ensures professiontd eye men that diagnosis, 
prescription, and treatment are the absolute functions of the 
optometrists or ophthalmologists and con neither effectively nor 
legally be performed by other. 

The Snellen Chart.^The Snellen chart with its characteristic 
large E is well known to everyone. It is employed by the 
ophthalmologist and the optometrist as an effective instrument 
for diagnosis and is widely used as a nonclinical skill test for pur- 
poses of recording visual acuity at twenty feet. Althbugh often 
used as an employment test, its use is limited because it usually 
measures acuity at one distance only, it measures ability to read 
letters which places illiterates and literates on different bases, 
and its results ore subj^t to external factors such as quality 
and quantity of light. A battery of tests sufficiently varied to 
inventory those visual skills most generally important in job 
success is needed. 

Non-clinical Tests.— Three different vision testing instru- 
ments, nonclinical in nature and designed for classification pur- 
poses, are at present in use in industry. These are all binocular 
testing instruments ^ and are colled the T^ebmoeular, the Sight 

^ Mitkoda of tettmg and proleoHnQ eyeaighl in indtulrtf. Industrial Health 
Series No. 4. Now York: Metropolitan Life iDBuronoe Co., 1046. 



VISUAL SKILL TESTS 


103 

Screener, and the Ortho^Rater, IDach of these instruments pro- 
vides for presenting to the employee a series of “targets" in 
order to measure his visual performance. The targets employed 
and the visual skills measured vary somewhat from instrument 
to instrument. The manufacturers of the instruments claim 
administration times for their batteries that range from three to 
five or six minutes. Other differences pertain to methods of 
installation, operation, and administration rather than to instru- 
mentation. 


THE VAIIDATIOir OF VISION TESTS 

Testing the Test. — As outlined in Chap. II, no teat should 
be used for the selection and allocation of employees or appli- 
cants until that test has been validated or tried out with em- 
ployees on that particular job. Vision tests are no exception to 
this rule. Figure 2-2 cited earlier demonstrates not only what 
can happen but what has happened when wrong vision testa 
were used to establish job standards. The establishment of 
visual standards on any basis other than statistical not only is 
apt to limit the available personnel without improving those 
hired but may actually result in the placement of inferior em- 
ployees on the job. Published results which demonstrate signifi- 
cant relationships between measured visual skills and job success 
in industry are almost entirely based upon research conducted 
with the Ortho-Rater,^ and it is the only one of the instruments 
upon which there is published reliabiUty data on industrial 
workers. Since all of the following statements show relation- 
ships between job success and scores made on. the Ortho-Rater, 
a more complete description of the instrument is desirable. The 
Ortho-Rater is illustrated in Fig. 8-8 and includes twelve distinct 
visual skill tests, each of which has been proved to measure a 
function that is important for successful job performance in 
several types of occupations. Seven of the twelve are given at 
the optical equivalent of twenty-six feet, and the remaining five 
at thirteen inches. The tests are 

^ Distributed by the Industrial Vision Department^ Bauaoh and Lomb Optical 
Co., Eochester, N.Y. 



l4 PRINCIPLES OF PERSONNEL TESTING 

At twenty-six feet : At thirteen inches : 

Verticd phoria Acuity, both eyes 

Lateral phoria Acuity, light eye 

Acuity, both eyes Acuity, left eye 

Acuity, right eye Lateral phoria 

Acuity, left eye Vertical phoria 

Depth perception 
Color vision 



Flo. 8-8— The Ortho-Rater, a nondiognoatie tesUng iDsirument. (Cotir(e*v o) 
Rausch and Lomb Optical Co.) 

The instrument is nonclinical and is usually operated by laymen 
who are trained in tlie use of a standard testing procedure. 

The Individual Profile. — ^As each employee or applicant is 
tested, a visual profile similar to the one in Fig. 8-9 is prepared. 
When job standards are established, it is quite easy to determine 
whether or not a given individual meets those standards. 

Distance Acuity. — Coleman ‘ has reported a study in which 
the importance of the distance acuity test for both eyes in con- 

^ Ohjihan, J. H. Vision teats fmr botter utilisation of manpower, j'oet. 
Mgml, and Mainl,, July, 1944. 



VISUAL SKILL TESTS 


Mcg3[U[xi3LJi3njn.j[TOR CI 

FSTwirnWHIHlnalnroogll 


DAU5CH ft LOMB OCCUPATIONAL VISION TESTS 

WITH THV outho ratbr 


ITAlCHCKf 


NOTEi 

VISUAL PEfiFORMANCB PROFILE 


• ? a • 10 II It 13 14 IS 



• i 7 • • 

B 9 10 II It 11 14 IS 


BAUSCH h LOMB OPTICAL CO 
CATALOOUC NO 71 1 1 4S 




b 


Fig. 8-9. — Sample individual record card Bhowing visual profile of one em 


















PRINCIPLES OF PERSONNEL TESTING 


106 

nection with ninety-seven milling-macliinc operators is demon- 
strated. The operators were rated as superior, good, fair, or poor 
by the supervisors, and Fig. 8-10 based on Coleman’s report 
indicates the proportion of each rated group Uiat attained a teat 
score of 9 or more. Whereas only half, or 50 per cent, of the 
poor rated group met this standard, 84 per cent of the superior 
group attained this level. Further analysis of his results indi- 
cates that although 49 per cent of the whole group were rated 



Flo. 8-10 j— T ha proportion of cmployeeB rated euperior, good, fair, and poop 
who attained a distance acuity score of 9 or better. (From Coloman.) 


good or superior, if only those with far acuity scores of 9 or better 
are considered, 58 per cent received one of these better ratings. 

Near Acuity. — ^Many industrial jobs require superior near 
point acuity. Tiffin and Wirt* have reported a study showing 
the relationship between near acuity scores and the earnings of 
seventy-two electric solderers. These solderers had near acuity 
scores ranging from 7 to 15, and their piecework earnings aver- 
aged approximately 80 cents per hour. Figure 8-11 shows the 
percentage of those at each score level who earned 80 cents or 
more per hour for the period studied. The figure riiows that 
none of those scoring 7 or 8 received 80 cents per hour and that 
the higher the vision score the greater the proportion of high 
earners. About 65 per cent of those scoring 12 on the test were 
among the high earners. 

Worse Eye. — Some jobs may be done well by persons with 

. * 7^***|i Joseph, and Wisp, S. E. Dotermiiimg visual standards for indua* 
trial jobs ^ statistical metliods. Trona. Amer. Acad. Opktkal. and Ololar,. 
NovcubGi-Becembar, 194J>, 4.^6, 



VISUAL SKILL TESTS 107 

one eye or by persons who have low acuity in one eye. Others, 
however, require a minimum acuity in each eye. Some studies 
utilizing Ortho-Hater data have included the checking of “worse 
eye” acuity scores against measures of job success. Tiffin * has 
reported such a study, and Figs. 8-12 and 8-13 are reproduced 



Fig. S^ll^The rclationBhip between near acuity scores and earnings of electric 
solderera. (From Tiffin and Wirt.) 

from his report dealing with a group of piston ring inspectors. 
The operators on the job were rated as A, B, C, or D on the basis 
of production data. Fifty-three per cent of the whole group 
were A or B operators, and Fig. 8-12 shows the proportion at 
various levels of distance acuity (worse eye). Tiffin points out 
that there is a relatively slight change in the proportion of A 
and B operators in the 1 to 5 bracket but that for higher scores 
there u marked percentage increase and that approximately 80 

^ TmnN, Joseph. Tho use of visual data oa an aid to increase production and 
efficiency. Tram. Amer. Aead, Ophthal, and Otoktr., January-Februaiy^ 1944. 


108 PRINCIPLES OF PERSONNEL TESTING 

per cent of those scoring H were superior. Figure 8-13 shows the 
corresponding pattern for near acuity '(worse eye). Whereas 
essentially the same pattern is found, the obtauied percentage is 



Fin 8-12 --The iclutionship between rated job performance on piston-ring in- 
apeotion and distance acuity score for worise oyc. {From Tiffin,) 



Fra. 8-13.— The relationship between rated job performance on piston-ring in« 
spectioQ and near acuity test score for worse eye. (From Tiffin,) 

even greater among the high scorers, 96 per cent of those scoring 
11 having been considered A or B inspectors. 

Phoria.— The phoria tests, sometimes referred to as the pos- 
ture tests, measure the position that the eyes tend to assume in a 
condition of physiologic rest. In the Ortho-Rater both the 



VISUAL SKILL TESTS 109 

lateral phoria test ' and the vertical phoria test are constructed 
so that orthophoria (the tendency for neither eye to deviate) is 
indicated by tlie middle scores. Consequently, with most jobs 



Fio. 8-14r-Thc relationship between near vertical phoria anti rated bucccbb of 
ninety-five milling-inachiDc operators. {From Ttffin and Wirt.) 

where muscular balance is important, middle scores constitute 
the acceptable range and extremely high or low scores constitute 
the unacceptable range. Figure 8-14 taken from Tiffin and 
Wirt,® however, pertains to a job that is unique in this respect. 
The figure based upon ninety-five milling-inachine operators 
shows a greater proportion of high rated operators among those 
whose eyes are well balanced vertically or whose right eye tends 
to tilt upward slightly. A small degree of right eye upward 
deviation can be tolerated, but no tolerance for the left eye is 
indicated. This fact, while probably associated with the head 
position assumed by many operators, is still another example of 
the need for the statistical determination of visual standards 
for each job. 

Color. — Color vision is known to be important in many jobs 

1 Wirt, S. Edoar. Studies in industrial vision. I. The validity of lateral 
phoria measurements in the Ortho-Rater. /, appL Pst/choL, 1943, 27, 217-232. 

* Tiffin, Josbvh, and Wirt, S. Eooar. The importance of viEual skills for 
adequate job performance in industry. /• consult. Psi/cAol, 1944, 8, 80-89. 


110 PSmCIPLES OP PEBSONNBL TESTim 

where obviously the operator must make color discriminations 
as is the case of certain printers who work with colored inks. 
Less obvious is the association of poor job performance with 
color-discrimination deficiency in many jobs where no logical job 
analysis would reveal the fact. Stump * showed a relationi^ip 
between accident information and OrUw-Rater color tost scores. 
Accident records of a gi'oup of employees were examined and 


Group A 
Group B 
Group 0 



Fio. 8-15,— Proportions of an accident^ree group (A), a acoi- 

dont group (B)» and a sorioua injury group (G) who attained and did not nitain 
a oiiticnl score on the color-vision test. (Frotn Stump*) 


three classifications were set up: Group A, accident free; Group 
B, high frequency; and Group C, serious injury. Figure 8-15 
based upon Stump’s data shows the proportion of each group 
attaining and not attaining a critical score of 4 on the Ortho- 
Rater color tests. Eighty-three per cent of ihe accident-free 
group passed the standard^ but only 69 per cent of the serious- 
injury group attained the standard. These differences, though 
not large, are significant and have been found in many jobs that 
do not appear to demand color vision. 


THE TISUAX. PROFILE 

Job Variability. — ^It has been falsely and expensively assumed 
that all industrial jobs have required one “good vision,” but it is 
now known that jobs differ greatly in their specific visual require- 
ments. When the validation techniques presented in Chap. II 
are applied to each of the twelve Ortho-Rater tests for a number 
of jobs, marked difference from job to job are found. Far 

^ firuMPj N. Frank. Spotting accideDt-prono workers by vision testa. Pact, 
Momi* and Mavni,, Jun^ 194& 



ViaVAL SKILL TK&TS Hi 

acuity, both eyes, is important on one job and not on another, 
whereas near lateral phoria is important on the first job but not 
on the second. The result is considerable variation from job to 
job in the statistically determined visual patterns which can best 
be shown as job profiles. 

Job Profiles. — Just as visual profiles for individuals are pre- 
pared, visual profiles for jobs ' cmi fdso be prepared. As each of 



Fio. 8-16 — Job profile for eleotrio solderers. Scores in shaded areas are uaao- 
oeptable. (From Slump.) 


the twelve tests is subjected to the validation procedure in con- 
nection with a specific job, those tests which ore significant for 
that job are identified and appropriate "cutoffs” or critical scores 
are established. By shading the unacceptable areas and leaving 
the acceptable zones or ranges white, it is a simple matter to 
match any given individual’s profile against the profile for that 
job and to determine to what extent he possesses the visual 
characteristics of the more successful employees on that job. In 
actual practice the job profiles are prepared on transparent tem- 
plates so that individual profiles may be placed underneath and 

*■ Wav, S. E. Stotisticid laboratory for vision tests at Purdue tTniveisity. /. 
appt. PsycAot, 1040, SO, 364-S68. 









112 PRINCIPLES OF PERSONNEL TESTING 

a quick determination made. Figures S’lG and 8-17 show the 
statistically determined job profiles for an electric soldering 
operation and a milling-machine operation, respectively. A test- 
by-test comparison indicates that the individual whose profile 
appears on page 105 as Fig. 8-0 meets the standards for electric 
soldering presented in Fig. 8-16; in every case his scores fall in 



Ski. 8-17;— Job profile for tnilling-machmo operators. Scores in shaded areas 
ore unacceptable. {From Colenutn.) 

the white area, and they never fall in the shaded area. Quite a 
different picture appears when this same individual profile is 
matched against the job profile for milling in Fig. 8-17. Note 
that this individual fails to meet the standards on five different 
tests: far vertical phoria, far acuity both, for acuity right, far 
acuity left, and near vertical phoria. Other things being equal, 
this individual as an applicant is a poor risk on the milling job 
because he does not possess the visual characteristics of the bet- 
ter operators. As a present employee, there is a good chance that 
his job performance can be improved by professional eye care. 
One point is quite clear, however: The visual demands of jobs 
differ to such an extent that almost any individual's visual profile 
will fit somewh^e. 

Electric Solderers. — The job profile for electric solderers 




VISUAL SKILL TESTS 


113 

shown in Fig. 8-16 is reproduced from Stump’s report.^ These 
solderers were paid on a piecework plan, and Fig. 8-18 shows the 
percentage at each income level who met the standards. Of 
those who earned 60 cents per hour or less (even though they 
were paid a minimum wage) 40 per cent met the standards shown 
in Fig. 8-16, whereas of those earning 90 cents per hour or more 


g, SO-105 
I 5 SO-S5 
J65-76 
I" £5-SO 



60% I 




Per cent not Per cent eat- 

eoliefying profito Ufying profile 


Fig. 8-18. — ^Percentage of electric solderers in each earning bracket who passed 
and failed visual standards presented in Fig 8-16. (From. &tump.) 



Per cent not eotiefytng profile 


Fig. 8-10.— Percentage of electric solderers in three quality perfomumce 
brackets who passed and failed visual standards presented in Fig. 8-16. (From 
Stump ) 


96 per cent met the standards. The figure also shows a consist- 
ent pattern for the other pay levels. Stump further reports that 
all those employees who failed to meet the job profile pattern 
averaged 60 cents per hour whereas those who did meet the stand- 
ards averaged over 80 cents per hour; in other words, those who 
meet the standards earn on the average about 34 per cent more 
than do those who do not meet the standards. 

Supervisory ratings on quality of work performed by the sol- 
derers were available, Figure 8-19, also from Stump’s study, 
shows that 89 and 88 per oent r^pectively of those rated high 
and average on quality met the standards but only 42 per cent 

1 Stump, N. Pbank. Vision tests predict worker capability, ilfpmt. and 
ifauit., 1940, 104, 121-124. 



114 PBINCIPLS8 OF PERSONNEL TESTING 

of the low-quality employees met the standard. Stump further 
reports that for a thh’teon-week period studied, those failing the 
standards averaged 28.6 half days of absence as compared with 
17.1 half days for those who met the standards. 

Milling-machine Operators.— The job profile for milling- 
machine operators shown in Fig. 8-17 is from a study reported 



Flo 8-20r— Percentage of all milllng-xnachino operators who were rated in four 
oatogoriGs (solid bars) vs, pciccntage of those who met visual standards (Fig, 
8-17} who were so rated, [From Coleman*) 

by Coleman.^ These opei'ators were rated as superior, good, fair, 
and poor, and the shaded bars in Fig. 8-20 shows that the per* 
centage distribution among all of the operators was 19, 31, 32, 
and 18 per cent, respectivdy. However, when only those meet- 
ing the standards are considered, percentages of 26, 36, 21, and 
17 per cent represented by the black bars were obtained. A sig- 
nificantly higher numb^ fall in the good and superior brackets 

1 CouMAK, J. H. Vision testa fov better uUHnUon of manpower. Foot, 
Mgmt. and Maint., July, 1944. 








VmAL SKILL TESTS 


116 

and a significantly amfiiller percentage fall in the fair and poor 
bracket when only those meeting the standards are considered. 

Piston-ring Inspectors. — ^Figure 8-21 shows the job profile that 
was statistically established for a group of piston-ring inspec- 
tors.^ Although only four of the tests “came through” in this 
particular study, Fig. 8-22 shows an inci'ease in the proportion 


FAR 


2 VCAIICAI. 

D 

■ 

1 

X 


8 

4 

S 


a 

■ 

1 

B 

■ 

B 

C UTCRAt 

B 


8 

8 

4 

8 

8 

7 a 

B 

10 

B 

D 

M 

D 

B 

BOTH 


o t 

8 

8 

4 

8 

8 

7 a 

8 

10 

11 

u 

13 

14 

18 

H RIGHT 

B 


a 

10 

11 

12 

13 

14 

15 

S LETT 

B 

mMmWMm 

8 

10 

11 

12 

13 

14 

IS 

UNAIPKO 

1 
















D 

o 


t 

8 

4 

S 

a 

7 

a 

0 

10 

II 

12 

mm 

Q 

0 1 

2 

8 

4 

8 

8 

7 a 

8 

10 

11 

12 

18 

14 

15 


NEAR 


90TH 
; RIOHT 

I ICfT 

r 

UNAIDED 


VliTICAL 
L ATI I At 


II 12 IS 14 IB 
O I I » 4 8 » 7 • I 10 11 12 IS 14 IB 

O I 2 8 4 8 • 7 t f 10 II ia IB 14 tl 


113 4 

2 8 4 5 8 7 


mmmmm 


Fio, ^21^--Job profilo for piaton-ring inspcctorB. Scores in Bhaded areas are 
unacceptable. (From Ttffln,) 


of A and B operators and a decrease in the proportion of C and 
D operators when only those who meet the standards are con- 
sidered. 

Drill-press Operators, — ^Figure 8-23 shows tlie standards * of 
drill-press operators using jigs. Here only two tests, lateral 
phoria near and lateral phoria far, were identified as being impor- 
tant. Figure 8-24 shows in A, B, C, and D operators the propor- 
tional shift that occurs when only those who meet the established 
visual standards are considered. Note the increase in A and B 
operators and the decrease in 0 and D operators. 

The Validation Approach,— For furtlier emphasis it seems 
desirable to point out again that tiie visual standards for the four 

1 Tims, JoBBPa. Vision sad industrial jnoducUon. lUum. Engm^ 1S45, 40, 
230-267. 

> Ibid. 











116 


PRINCIPLES OP PERSONNEL TESTING 



Pio. 8-22.— Percentage of all piston-ring inspectors who were rated in. four 
categories vs. percentage of those who met visual standards (Pig, 8-21) who were 
BO rated. {Pram Ttffin,) 



Fig, profile for drill-press operators. Scores in shaded areas are 

acceptable. {Prom Tiffin,) 


VISUAL SKILL TESTS II7 

job daasifications discussed here are not arbitrary estimates but 
have been systematically arrived at by means of tho general vali- 
dation procedure discussed in Chap, II, In every instance, good 
and poor employees have been identified by one or more of the 
means listed in Chap, III, and the performance of each group on 



Fia. 8-24.-^FercentaeeB of all drill-prcss operatoiB who were rated in four cat- 
egories vs. percontage of those who met visual standards (Fig. 8-22) who were so 
rated. (From Tijfin.) 

each of the twelve tests was compared. Only those tests which 
showed differences between these two groups have been included 
in the standards. The variability of the standards from job to 
job is quite apparent. Already standards ‘ for between 700 and 
800 job classifications have been established. 

Visual Job Families . — A more recent development in the 
establishment of standards has been extremely useful in those de- 
partments or plants where there is a small number of employees 
in any one job classification, maldng the statistical establishment 

^ Theso have been established by the Occupational Analysis Laboratory» 
Division of Applied Paychology, Purdue University. An unknown number have 
been established by industries themselves. 



118 PRINCIPLES OF PERSONNEL TESTING 

of standards for these individual classifications practically im- 
possible. Tecliniques beyond the scope of this discussion have 
been developed whereby different jobs in the same department 
with similar visual demands can be grouped together into visual 
job families and standards established for the family rather than 
for the individual classification. This grouping procedure which 
is a statistical process sometimes classified jobs in a particular 
situation into two visual job families with quite similar stand- 
ards, one of which is more rigorous than the other. 

SAFETY AND VISION 

Accident Proneness. — ^The accident literature presents much 
evidence in support of the statements that a small proportion of 
individuals have a high praiiortion of the accidents, that this 
small group has more accidents than can be accounted for on the 
basis of chance alone, and that tlie more accidents a given in- 
dividual has bad during a previous period the greater his proba- 
bility of repeating during the next period. These persons who 
have more than their sliare of accidents have come to be called 
"accident prone,” and accident proneness simply describes the 
fact that an individual has more accidents. 

Vision as an Accident Cause. — ^Although all of the character- 
istics of accident-prone employees are not known, sufficient prog- 
ress has been made that Stump ^ has estimated on the basis of 
evidence that about one-fourth of industrial injuries could be 
eliminated if proper standards of visual performance were estab- 
lished and adhered to. Stump,* in another report, describes the 
establishment of OrtJw-Rater visual standards using degree of 
accident proneness as a criterion. By means of visual standards 
so established, it was possible to classify employees on the basis 
of vision scores alone. 

Wirt and Leedke * have reported significant relationships be- 

^ SriTMPf N. Frank. Visual functions and safety. National Safety News, 
Juno, 1044. 

« Ibtd. 

« Wirt, S. Edgar, and Lhsdkb, Hazrl N. Skillful oyes prevent accidents. 
Annual News Letter, Industrial Nunaing Section, National l^fety Council, Novem- 
ber, 1946. 



VISUAL SKILL TESTS 119 

tween vision scores and the accident records of paper-machine 
operators and tradesmen. In the case of the paper-machine 
operators, fifty-two men who had a record of one or more serious 
accidents and another fifty-two with no serious accidents were 
matched on the basis of age and experience on the job. The 
Ortho-Rater scores of these men were matched against the job 


Those meeting all 
visual standords 


Those not meeting 
all visual standards 





Per cent with | 
serious accidents 


63% 


■■ 33 % 

Per cent with no 
serious oecldenfs 


Fig. 8-26^PerGentage of papcr-maclnne operators meeting visual standards 
and not meeting visual standards who had serious accidcnlo. (From 'Wxrt and 
Leedke.) 


Per cent with 2 or more I Per cent with no 
serious accidents serious accidents 

Fig. 8 - 26 ^PerGentage of millwrights meeting visual standards and not meclr 
ing visual standards who had two or more serious accidents. (From Wirt and 
Leadke ) 

profile previously established in accordance with the procedure 
already outlined. These 104 employees wore then classified as 
having met or not having met all visual standards for that job. 
Figure 8-26 shows that whereas 67 per cent of those not meeting 
all visual standards had one or more accidents, only 37 per cent 
of those who met the standards had accidents. Similar results 
were obtained in the study involving a group of ninety-four 
millwrights and other tradesmen. Figure 8-26 shows that 
whereas 81 per cent of those who failed to meet all visual stand- 
ards had two or more accidents, among those who did meet every 
standard, 64 per cent had two or more accidents. 

Safety as a Measure of Job Success.— As pointed out in Chap.’ 
Ill, measures of job success that are used in test-validatioji 


120 PRINCIPLES OF PERSONNEL TESTINO 

procedures frequently reflect the nature of the personnel or pro- 
duction problems which are associated with a given job or a 
particular plant. Similarly the accident problem is one that is of 
considerably greater importance on certain jobs and in certain 
industries than in others. The importance of accident records 
should not be overlooked when ways and means of classifying 
employees for purposes of test validation ore being sought. 

Selection and Placement. — Since this book deals with psycho- 
logical tests for purposes of personnel selection and placement, 
the emphasis in this chapter has been primarily on selection and 
placement. However, no area of psychological testing is any 
more far-reaching in its implication for all-around personnel im- 
provement than is vision testing. It seems important to discuss 
related phases of a total industrial vision program. 

Professional vs. Nonprofessional Services. — ^The relation- 
ship between the lay tester and the professions has been clearly 
seen by ICuhn.^ A pioneer in the medical profession in the use of 
job analysis in the correclbn of visual defects, she has demon- 
strated professional participation on a high level. There is no 
conflict between professional activity of this sort and those func- 
tions which can adequately and effectively be performed by the 
Israined lay test administrator. Those types of testing which 
ore clinical in nature are clearly among the activities that must 
be performed by the ophthalmologist and optometrist. 

Referrals. — The lay tester cannot diagnose or prescribe. 
Therefore it is desirable that he refer to ophthalmologists and 
optometrists persons whose visual skills do not fall within the 
acceptable limits that have been established by statistical means 
for that job. Whether the plant has a full-time medical director, 
employs an optometrist or an ophthalmologist as a consultant, or 
leaves the employee to seek in the community the service that 
he needs has little bearing on the fundamental principles of 
operation. Most important of all is the fact that the professional 
man be made thoroughly familiar with the testing procedure, 
the benefits of lay testing, ajid the meaning of job standards as 
statistically established. 

KtibM, Hbdwiq S. Indvtlrial opkthdtnologv. St, Louia; Mosby, 1944. 



VISUAL SKILL TESTS 


121 

Safety Eyewear. — Whitney ' has reported that in one plant 
injuries to the eyes constituted 12 per cent of all of the injuries 
for a six months’ period. The importance of eye protection 
through safety eyewear is well known to safety engineers. Logi- 
cally an eyewear program should be tied in with the total pro- 
gram. Par too often safety goggles are more or less forced on the 
employee without any regard for the effect that they might have 
on his visual performance. Vision and safety are related; there- 
fore attention to eye protection without consideration of eye per- 
formance is unsound. A safe employee may even be mode an 
unsafe employee tiirough madvertent intei’ference with his vis- 
ual performance. Feinberg and Sewell ‘ have described one five- 
point program as follows: 

1. Preliminary testing of twelve visual skills of each employee 
to determine those in need of further examination. 

2. Provision of a complete eye examination for those requii'ing 
it as indicated by above tests. 

3. Provision of safety glasses, either plain or prescription, cor- 

rectly fitted by a company optician, without cost to tiie 
employee, with the provision that he would be responsible 
for the glasses and that they remain the property of the 
company. ^ 

4. Extension of preliminary visual testing to the employment 
office for preplacement evaluation of applicants. 

5. Furnisliing safety glasses to all visitors to the areas covered 
by the directive. 

Eye protection • is a scientific matter, and any eyewear pro- 
gram rightly deserves the guidance of a competent professional 
man. 

Job Improvement. — ^Bven though ^es were made before jobs, 
many jobs have been designed or engineered with little or no 
thought of the demands that they make on the human being. 

^ Whitnet, L. HolijANd, Industrial firet aid— eye injiiiies, Indtulr. Med., 
1946, 14, 337-338 

* FniNUBRa, IlniHAHD, AND Sbwbll, John C, Eyesight through foresight. 
Afalional Safely Nem, ianuaty, 1046, 

* I'or a oomprehensive discussion see Kvhn, op. eU,, pp. 173-221. 



122 PRINCIPLES OF PERSONNEL TESTING 

Many jobs are capable of visual simplification. The discussion 
of tliis problem is beyond Uic scope of this book, but it should 
be pointed out that the increased emphasis on work simplifica'* 
tion as well as the work that is being done in the setting of visual 
job standards is sufiiciently spotlighting this phase of engineer- 
ing so that more and more bought is being given to the em- 
ployee and his eyes. 


SUMMARY 

Vision testing is a highly important phase of the total per- 
sonnel testing program. There is no one “good vision," and it 
has been demonstrated that jobs do not all make the same visual 
demands upon the employee. There is a great need for tlie appli- 
cation of the same kind of validation procedures to vision tests 
as have been advocated for other tests. There is a distinction be- 
tween clinical and nonclinical tests; and whereas the former 
should be adrainisteretl only by an ophthalmologist or an optome- 
trist, the latter can be administered to advantage by the layman 
provided he is adequately trained. There is a real opportunity 
for coordinating the vision-testing program and those phases of 
the safety program winch encompass vision. 



CSAPTBR IX 

TESTS FOR MECHANICAL AND OTHER 
MANUAL WORKERS 


The group of employees classified os mechanical workers con- 
stitutes the largest single category of industrial personnel. Tests 
intended to predict success on jobs in this area are consequently 
of extreme importance. The purpose of this chapter is to report 
significant validity studies in this field and to enumerate and 
describe certain available tests. 

The Measurement of Mechanical Ability. — ^The terms me- 
chanical aptitude or medumical ability are used with such a vari- 
ety of meanings that they are almost meaningless without some 
attempt at definition. So-called ‘'mechanical” jobs range 
throughout the various occupational levels, and the term mechan- 
ical aptitude is frequently used to describe the individual who 
tends to excel, regardless of the occupational level. Although 
research has not identified the components of mechanical ability, 
it is quite probable that there are three reasonably distinct com- 
ponents.^ 

Probably the most important component is associated with the 
capacity or the ability to understand mechanical relationships 
and mechanical processes of various sorts. Engineers are probar 
bly at the very top of this particular component. 

Manipulative ability* constitutes a second component. It 
probably embraces the various dexterities and muscle coordina- 
tions; and while certain research has demonstrated low inter- 
relationships between measures of certain of these abilities, it 

^ Bkhhbit, GixwaB IC., mo Cruiksimxk, Iturii M. A tummarif oj manual 
and mechanic^ abUily teats. New York. PayclioloEical Corp,, 1042. 

* Long, W. F., and Lawsiib, C H , Jr The effeoUve use of mumpulative 
tests in industry. Psyefiol, Bull., 1047, 44, 130-148. 

128 



124 PRINCIPI^S OP PERSONNEL TESTING 

may be that there is a generalized manipulative or muscular 
factor. 

Still a third element Uiat may be included in an analysis of 
mechanical aptitude is involved in the motor abilities of strength, 
speed of movement, and endurance. But whether future re- 
searches support or refute this classification, it is nonetheless Use- 
ful at the present time for thinking through the use of tests for 
industrial purposes. 

Tests of Mechanical Ability. — ^Tests of mechanical ability may 
be classified in a variety of different ways, one of which is in 
terms of the nature of the testing materials themselves. Tests 
classified in this way fall into one of two categories: paper-and- 
pencil tests and apparatus tests. Certain elements can be 
measured cither by apparatus or by papcr-and-pencil tests, but 
eadi type has unique purposes for whicli it alone can be used. 

Tests of mechanical ability may also be classified in terms of 
their function or intended function. Bennett and Cruikshank * 
list four groupings which seem to include all paper-and-pencil 
tests. These groupings are listed below. 

Information . — Tests of this sort range all the way from those 
which measure general background knowledge such os basic tool 
information to rather detailed tests of specific trade information. 

Spatial Relations . — ^Tests that measure space and form percep- 
tion have been shown to correlate with ability to perform some 
kinds of mechanical work. Some authorities contend that tests 
of this sort are really nonlanguage intelligence tests. 

Mechanical Comprehension . — ^Tests of this character measure 
more than factual knowledge as such. They tend to tap an in- 
dividual’s understanding of underlying mechanical principles and 
relationships. 

Interest . — ^Most interest tests such as those discussed in Chap. 
VII include mechanical areas or components. 

Certain of the apparatus tests may also be used to provide 
measurement in some of the aforementioned areas, particularly 
spatial relalions and mechanical comprdiension. 

^ Btmms AND CBtnxeHAHK, op. at. 



tests fob mechanical and other manual workers 126 


JOB KNOWLEDGE OR TRADE TESTS 

Definition. — ^When a teacher who has offered instruction in 
arithmetic wishes to evaluate the student’s learning in that field, 
he administers an arithmetic-achievement test. A job-knowledge 
or trade test is fundamentally an achievement test. It measures 
the individual's learned knowledge or skill in a particular occu- 
pational ai'ea in which he has had either experience or specific 
training. Although most trade tests are information tests using 
oral or written questions, some trade tests are manipulative in 
nature and measure performance skill. Regardless of their na- 
ture, however, they are primarily useful in evaluating an indi- 
vidual’s present adequacy in the job area. 

Oral Trade Questions . — A trade test utilizing oral trade ques- 
tions is basically no different from a pencil-and-paper trade test. 
It is sometimes argued that since people in certain occupations 
seldom engage in reading and writing in connection with their 
jobs, they are ill at ease and at a disadvantage when asked to take 
a written test. The validity of this argument has not been 
demonstrated. However, oral trade questions are useful, and 
Thompson ‘ in 1936 published an oral trade question manual. 
However, by far the most comprehensive and ambitious work in 
this area was done by Stead, Shartle, and associates* of the 
United States Employment Service. The Worker Analysis Sec- 
tion of this agency did extensive work in the oral trade question 
area and made impressive manuals available to the various local 
U.S.B.S. offices. They have not been made available to private 
industry and business for obvious reasons. One example, how- 
ever, is sufficient to demonstrate the usefulness of this approach. 
In the oral trade test for painters, fifteen questions were finally 
retained by methods similar to those described in Chap. II. In 
the validation process, these fifteen questions were asked of a 
group of expert painters, a group of apprentices and helpers, 

1 Thompson, Lomn A., Js. Intorvievf aids and trade queatiom for employ- 
ment offices. New York; Harper , 1938. 173 pp . , 

* Stead, Wiluam H., SHAaTi.n, Cabboll L,, et al. OcenpaUamd cott««eJt«9 

teeAniqnee^ New York: AmericoQ Book, 1940. 



126 PRINCIPLES OF PERSONNEL TESTING 

and a group of related workers. The term related workers was 
used to identify a group of workers who, though not actually en- 
gaged in the job in question, were in a favorable position for 
picking up certain incidental information about tlie job. Figure 
9-1 shows the pcrforninnce of these three groups on the fifteen 
questions. The possible score range was divided into three 
brackets, namely, 0 to 5, 6 to 8, and 0 to 15. The figure indicates 
that whereas 78 per cent of the expert painters scored in the 9 to 


14% 


Experl painlers 


.43% 


Apprentices 

r, . . ^ . -96% 

Related ■ ■■ ■■*■” 

workers 


8%1 

40% 



76% 


17% 


Percent scoring 
8 or lese 


4% 


Per cent scoring 
9-16 


Fiq 94 ^PcrccntagG of three classes of employees scoring 0 to 15 (solid bars), 
6 to 8 (open bars), ami 0 to 5 (shaded bars) on nii oral trade test for painters* 
(From Slcad and 8harlle») 


15 bracket, only 17 per cent of the apprentice group and none of 
the related worker group scored in the category. The lowest cate- 
gory, 0 to 5, included 96 per cent of the related workers, 48 per 
cent of the apprentice group, and only 8 per cent of the expert 
painter group. 

It is probable tliat oral trade questions will be most useful in 
the future when they are developed for use in specific industries. 
It is quite likely that the use of such tests for upgrading purposes 
has not yet been sufficiently explored. 

Written Trade Tests. — As indicated above, trade tests of the 
paper-and-pencil variety are essentially the same as oral trade 
questions. Because they can be administered to groups, they are 
more easily standardized and are more comprehensive. 

Trade tests are available for a number of occupations, and 
reference to Appendix C will reveal sources of supply. Specific 
mention is made here, however, of the series known as the Purdue 
Vocationed Testa ‘ and a later series called the Purdue Personnel 

^ Published by Science Research Associates, Inc., 228 Soutli Wabasli Ave., 
Chicago 4, 111, 


TESTS FOR MECHANICAL AND OTHER MANUAL WORKERS 127 

Tests.^ The fonner series includes the Purdue Test for Electri- 
cians, the Purdue Test for Machinists and Machine Operators, 
and the Purdue Blueprint Reading Test. The latter series now 
has in the process of development and standardization tests in 
the following ai'eas: oxyacetylcne welding, arc welding, automo- 
bile mechanics, and engine lathe operation. 

Trade Tests and Upgrading. — ^The problem of “who shall be 
upgraded” is one that is the basis of many grievances and man- 
agement problems. Where the union contract places the entire 
emphasis on seniority or when the usual phrase “employees shall 
be advanced to better paying operations on the basis of seniority 
provided they are able to do the work,” management is sometimes 
at a disadvantage. Even under the latter type of arrangement 
management has found it difficult when a particular employee 
with high seniority did not have the ability to do the work. The 
use of trade or competency tests in the upgrading process seems 
to be one part of the answer. The Collhis Radio Company is an 
outstanding example. This company through an agreement with 
the union has established trade tests for most classifications, and 
adequate test performance entitles the employee to a trial period 
on the job. According to the company’s testimony, “the use of 
tests has reduced failures during the trial period to less than 2 
per cent.” * Below are ten questions selected from the test in 
basic radio theory for use in upgrading employees to the classifi- 
cation Test Technician B. 

1. In a radio-froqucncy amplifier stngo having a plate voltage of 1»250 volts, 
a plate current of 150 milliampcres, a grid current of 15 tnillmmpcres, and 
a grid-Icnk resistance of 4,000 ohms, what is the value of the operating 
grid bias? 

2. Sketch a block diagram of a crystal-controlled transmitter, using a buSfer 
stage and high-level modulation. 

3. Wliy is the distributed capacity of a coil always incrcasod by the wax or 
other coating used for protection against moisture? 

4. How is the vacuum-tube plate current of an RF amplifier affected as the 
pfate-resonant frequency is varied? 

1 Copyrighted by the Purdue Research Foundation and distributed by the 
Divisioii of Applied Psychology, Purdue University, Lafayette, lud. 

* Rankin, BKRHAitn Notes on employee testing. Collins Signal, January, 
1947, 14, 23 (Collins Radio Co., Cedar Rapids, Iowa). 



128 


PRWCJPLEB OF PmSONNEL TESTING 



5. The above is a circuit of au RF amplifier. What is tlie fundamental dif- 
ference ill tlio action of the meter if tbo tube is operating Class O from 
what it would be if the tube was operating Class A? 

6. What are tlie principal output- voltage ripple frequencies in a full-wave 
rectifier? 

7. What is the i^clation between the direct-current power input of tlie plate 
circuit of the stage being modulated and tlie output audio power of tlie 
modulator for 100 per cent sinusoidal modulations? 

8. Sketch a block diagram of a suporlictcrodyne receiver allowing an audio- 
frequency stage, ladio-frequency stage, audio power-amplifier Stage, 
speaker, mixer, second detector, and intermediate-frequency stage. 

9. What is the sum of 26 cycles, 25 kilocycles, and 25 megacycles? 

10. What la a "parasitic oscillation?” 

As the use of trade tests for upgrading purposes becomes more 
and more accepted by labor mid management, it is quite possible 
that there "will be industry-wide developments in trade teats for 
that purpose. Until then, however, commercial tests will be 
available for only the most common trade areas. 

Interview Aids. — Of interest in connection with this discussion 
of trade tests is a series of so-called ‘Interview aids** that have 
been published in the Purdue Vocational Series. There are three 
tests in this series, Can You Bead a Micrometerf, C<m You Bead 
a Scale f, and Can You Bead a Working Dravmgt Each is a 
one-page test and requires less than ten minutes to administer. 
As the name implies, they ore useful in employment situations 
where it is necessary to evaluate the applicant*B skill in these 
simple but nonetheless basically fundamental skills. 

Trade Tests as Aptitude Tests. — ^When is a test a trade test, 
and when is it an aptitude test? This often-asked question 
probably cannot be answered to the satisfaction of all critics. 
Generally speaking, a trade test is a test designed to measure 
learning that has accrued from specific occupational training or 
experience. On the other hand, although everything that any 


TESTS FOR MECHANICAL AND OTHER MANUAL WORKERS 129 

test can measure must be learned or acquired in some fashion, in 
the strictest sense an aptitude test is one that measures aVilla or 
knowledges which have been more or less inadvertently acquired 
or are a function of maturity. More generally, however, the term 
aptitude is used to denote any test that is useful in predicting job 
or training success. For example, the Purdue Industrial Trains 
ing Classification Test discussed in Chap. IV is technically an 
achievement test. It measures one’s ability to apply fundamen- 
tal arithmetic in practical situations. However, as indicated in 
the discussion regai’ding the selection of electrician trainees, it 
predicts success in the training program better than any other 
single test. The Purdue Mechanical Adaptability Teat*^ die* 
cussed later is basically an inventory of an individual’s store of 
practical mechanical and electrical information. Since it predicts 
success fairly well in a number of industrial jobs, it can be con- 
sidered an aptitude test. The purpose of this discussion is to 
point out that whether a given test is or is not an aptitude test is 
in reality an academic question. What the personnel or employ- 
ment man wants to know is “Does this test adequately evaluate 
an applicant’s prior experience in the field?” or “Does this test 
help to predict success or failure of an individual who has not 
done this kind of work before?” In the remaining sections of this 
chapter no effort will be made to tag tests as this or that kind, and 
the position is here taken that the less one worries about whetlier 
or not Test A is an aptitude test or not the better off he is. The 
pragmatic question is “Does it work?” 

ASSBMBLESS, PACEESS, AND INSPECTORS 

Watch Assemblers.— Blum * used O’Connor’s Tweezer Dex- 
terity Test and Finger Dexterity Test in studying a group of 152 
female assemblers in a watch factory. Using turnover as a cri- 
terion and establishing critical scores on each of the two tests, he 
was able to demonstrate a significant relationship. Figure 9-2 

1 Lawshb, C. n, Je., Sbmakbk, Iebnb A., and Tifpin, Jobbph. The Purdue 
mecbanioRl adaptability test. /. appl. Ptychal,, 1046, 30, 443-453. 

s Blum, Miuton L. A contributioa to manual aptitude meaBuremeat in in- 
duBtjy. J. appL PapehoL, 1040, 24» 381-416. 



130 FnWaiPLES OF PERSONNEL TESTING 


from Blum’s data sliows the percentage of each tenure group who 
passed both tests. Whereas only 22 per cent of those who stayed 
on the job a weelc or less exceeded the critical score on both tests, 
67 per cent of those who remained on the job a year or more per- 
formed above the critical score level on both tests, Blum’s study 
furtlier demonstrated the relationship between performance on 
these two tests and job success as measured by the ratings of 
supervisors. An earlier study by Blum and Candee ^ was less 
conclusive. 


Mora Ihon one year 
4 fnos. to one yeor 


33%| 




43 % 


I 




I 


One week to 4 moG. 62 % 




Lees than one 
week 


78% 




% foiling one 
or both 


■■■■ 67 % 

■■ 36 % 

■ee% 

% passing both 
teste 


Fla. 9-2^Proporlioii of employees in various tamiro grouiw who mot or failed 
to meet tbo critical score on two doxtonty trsts. (From Blum^) 


Electrical-fixture and Radio Assemblers. — ^In research de- 
signed to select tests for identifying potentially good assemblers 
of electrical fixtures and radios, Tiffin and Greenly * studied three 
different job classifications. While they found different tests to 
be useful in the various classifications, each of the following was 
useful in. one or more of the classifications: finger-dexterity test, 
hand-precision test, intelligence test, and vision tests. In two of 
the classifications they found height and weight to be indicative 
of success. In one classification, tliat of radio assembler, a com- 
bination of finger dexterity, hand precision, near visual acuity, 
and color vision yielded a multiple correlation of 0.60 with effi- 
ciency ratings. In a recent study involving radio-assembly op- 
erators Goodman * has reported significant relationships between 
selected subtests in the MacQuarrie Test for Mechanical Ability 

^ Candeb, Beatrice, and Blum, Milton L, Report of a study dono in a 
watch factory. J. appl, Psychol, 1937, 21, 672-682. 

* Tiffin, .Tobrpii, and Greenly, R. J, Employoe Rolootion tests for clcotrioal 
fixture assemblers and radio asserablcrB, J. appl, PuyekoL, 1030, 23, 240-203. 

• Goodman, Charles H. The MacQuarrie lest for mcrhanical ability: I. 
Selecting radio aeacmbly operatora. J, appl, PayduA,, 1940, 80, 580-60S. 



fSSTS FOR MECHANICAL AND OTHER MANUAL WORKERS 131 

and estimates of training success. He reports selection effective- 
ness as being about 12 per cent better than chance. 

Radio Tube Mounters. — Forfano and ICirkpatrick report an 
investigation ^ involving twenty radio-tube mounters who were 
administered the Otis Selj-advimistervng Test of Mental Ability 
and two temperament tests, Washburne’s Social Adjustment In- 
ventory and tire Bell Adjustment Inventory. By a weighting 
scheme in which equal value was given to the mental ability test 
and the two temperament tests combined, a marked relationship 
between test scores and supervisory ratings was demonstrated. 
They conclude that although a person of low intelligence is likely 
to fail or do poorly on the job, higli mental ability does not ensure 
success. The two temperament scales when added to the mental 
ability test materially improve the ‘'batting odds” in predicting 
success as measured by supervisory ratings. 

Glove Assemblers. — ^Blum * in another study of assemblers in 
a glove factory devised the Blum Serving Machine Test which is 
scored in number of seconds with error allowances. Dealing with 
experienced operators engaged in “fordietting, thumb inserting, 
closing and pointing,” he found that various critical scores on the 
test eliminated larger percentages of successful operators than 
unsuccessful ones. 

Perhaps logically belonging in the next section but so closely 
related that it is included here is Shartle’s study* involving 
seventy-seven power-sewing-machine operators. Using a number- 
comparison test, a names-comparison test, and four parts from 
the MacQuarrie Test for Mechanical Ability, he demonstrated a 
significant relationship between combined scores and job success. 
Figure 9-3 shows that of those rated as most proficient, average, 
and least proficient, 86, 76, and 38 per cent, respectively, were in 
the upper two-thirds on test scores. 

^ Fobiano, Gborob, and Kidkpatrick, Forrest H. Intolligenca and adjust- 
meut DieasuTcments in the selection of vitdio tube mounters. J. appl, Psychol., 
1946, SO, 267-261. 

* Blum, Milton L. Selection of Bowing machine operators. J. appl. Psychol, 
1048, 27, 36-40. 

» Shabtlb, C. L. Psycholooical aids t» Ike seleclion of workers. Pcrsonnol 
Senes No. 60. New York; American Management AsBOoiation, 1041. 



132 PRINCIPLES OF PERSONNEL TESTING 

Food Canners.— Benge ‘ made a study of 173 food packers in 
a cannery in which he used a manipulative test involving the 
placing of dislcs or pegs. He divided the group into seventy-nine 
who were most satisfactory and ninety-four who were least satis- 
factory on the basis of meiit'rating scores and selected a critical 



Fia 9-3 ^Proportion of cacsh of thrco ability groups of powor-acwing-inaohinG 
operators who ^verc among tlie uppci* two-Chirds iu test por/ormancc. (^om 


Sharile^ 


Roiad mo9f sotisfoctory 


Roted leosi 
sotlsfoctory 


Z2% 



P«r cant below 
critical tcorc 


78% 


Per cent above 
critical score 


Fig. 0-4^FroportionB of the most satisfactory and least satisfactory groups of 
food packers who were above Uie critical scoro on a hand-dexton'ty test, (^om 
Benge,) 


score on the test. Figure 0-4 plotted from his data indicates that 
whereas 73 per cent of those rated as most satisfactory exceeded 
tlie critical score, only 28 per cent of those rated least satisfactory 
were above the critical score. 

Pharmaceutical Packers. — Ghiselli* studied twenty-six in- 
spector-packers in a pharmaceutical house whose job consisted of 
the following: 

1. Filling capsules, vials, and bottles wiHi serums, antitoxins, 
and similar biologic^ 

2. Stoppering the containers 

1 Bskob, Euobnb J. Use tcate !a solcoting personnel. Pood Packer, Deaem- 
ber, IQ44, 36-37. 

‘ Gkisklu, EnwiK 13. Teats for the aelcotion of inspaotoiypaokQn}. J, appl, 
Pevehol., 1Q42, 26, 468-476. 



TESTS FOB MECHANICAL AND OTHER MANUAL WORKERS 133 

3. Examining them for the presence of extraneous foreign 
material 

4. Labeling them 

6. Cartoning and packaging them. 

Using the combined ratings of the supervisor and the floorlady 
as a criterion, he secured coefficients of correlation with scores of 
tests as follows: Minnesota Paper Formhoard, 0.57; Minnesota 
Rate of Manipulation Test (turning), —0.40,' a pegboard, 
-0.50. 

Industrial Inspection. — ^The inclusion of the assembler-in- 
spector job discussed above makes some observation on industrial 
inspection pertinent. The word inspector, as every informed per- 
sonnel man knows, is used to apply to many classifications. 
Sometimes an inspector traces down defects in the wiring of a 
radio; sometimes he simply uses go and no-go gauges; and some- 
times he passes or rejects material on the basis of appearance 
alone. From the standpoint of personnel testing, the third classi- 
fication of visual inspection is one of the moat important areas 
and is treated in Chap. VIII. So diversified, however, is inspec- 
tion that a section has not been specifically assigned to it. In- 
spection is so often combined with other operations or is so 
dosely related to other jobs that to emphasize the inspection 
phase is inappropriate. Tiffin and Rogers’ * study involving as- 
sorting-room operators in a tin-plate mill is illustrative. They es- 
tablished scores on three vision tests and on the Purdue Hand 
Precision Test and, in addition, established a minimum weight 
of 118 pounds and a minimum height of five feet two inches. It 
seems obvious that height and weight have nothing to do with in- 
spection as such. But successful girls on this job had to handle 
large volumes of metal in tiie course of the day, and physical 
stamina is apparently related to height and wei^t. At any rate, 
height and weight were found to be important predictors. Many 
other jobs in the inspection category make important demands 
on the individual that are not implied in the word inspection. In 

1 Tmm, JoBBPB, AND Boonas, H. B. The eelectioa and training of inapectom 
Pertotmel, 1Q41, 18, 3-20, 



134 PRINCIPLES OF PERmNNEL TESTING 

anothCT study of inspectors Shuman ‘ found that selection of a 
particular type of inspector in the aircraft-engine and propeller 
industries could be significantly improved through the use of 
Bennett’s Test of Mechanical Comprehension, the Otis Self- 
administering Test of Mental AbUUy, and the Revised Minne- 
sota Paper Form Board, 

OPERATORS AND MACHINE ATTENDEES 

Aircraft-riveters and Sheetmetal Trainees. — ^The occupa- 
tional analysis section of the United States Employment Service^ 
in the process of developing approximately 170 test batteries, set 
up a three-test battery for aircraft-riveter trainees.^ The battery 
consisted of a pegboard, a finger-dexterity test, and a figure- 
copying test. In an experimental sample of fifty-one trainees in 
a national defense training program the battery yielded a correla- 
tion of 0.60 with the criterion. Further study showed that the 
third scoring highest on the battery as compared with the third 
scoring lowest was 26 per cent higher in number of rivets driven 
per hour. Hardtke,* working with four groups of aircraft sheet- 
metal trainees, set up test batteries involving “measures of dex- 
terity and spatial perception” and obtained multiple coefficients 
of correlation ranging from 0.36 to 0.65. 

Coil Winders. — ^Hayes* used two pegboarcls, O’Connor's Fin- 
ger Dexterity Test and Ihe Western Electric Pegboard, in com- 
bination with an evaluation of prior experiences os a predictive 
battery for coil winders. Figure 9-5 shows the relationship be- 
tween composite scores and estimates of learning speed. Cook * 
in studying the same job classification set up a manipulative test 
that demonstrated a significant relationship with success on the 

» Shuman, John T. The value of aptitude teats for factory workon ia tlio 
aircraft engine and propeller industries. /, appi. Psychol, 1046, 30, 166-160. 

3 Measuring occupational aptitudes. Oceupaiiomt, 1044, 22, 387-446. 

0 HAiuaxE, £. F. Development of an aptitude lest baliery for (tinsraft sheet 
metal trainees. Ph.D. thesis on file hi libnuy, Univcisily of Wisconsin, Madison, 
Wis., 1043. 

* Hayes, Eleanor Q. Selecting women for sliop work. Person, J., 1032, 11, 
60-86. 

" Cook, D. W. Psychohgteal aids in the selection of viorkers. Pcrsoimol 
Scries No. 60. New York; Amciioan Management Association, 1041. 



TESTS FOB MECHANICAL AND OTHER MANUAL WORKERS 136 

job. Figure 9-6 shows that Cook was able to establish a critical 
score above which 92 per cent of the better employees and only 
28 per cent of the poorer employees fell. 

37 Scoring 
MO or over 
35 Scoring 
100-109 

33 Scoring 
90-99 

57 Scoring 
80-09 

30 Scoring 
70-79 

16 Scoring 
69 or iowtr 

iOO 80 60 40 20 0 20 40 60 80 KX) 

Per cant fair or slow Isornars or did Par csnl quick Isornart 
not complete training period 



I Did not complete 
training 

I Slow learners 


Polr learners 


n Quick learners 


FiOi 9-3. — ^Relation between composite score on two doxterity tests and speed 
of learning for a group of 308 coil winders, (From Tiffin 


Beef employeee 
72 

Poorest ^ 
employeee 


Below criticol 
score 


Above critical score 


Fra. 0-6w — ^Proportion of best and poorest coil winders who attained and failed 
to attain a critical score on a manipulativo test. (From CooM 


Solderers. — Cook * reports an investigation involving solderers 
for whom he established a battery consisting of the Otis Bdj- 
adminiatering Test of Mental Ability, an apparatus monotony 
test, and a finger-dexterity test. The employees were classified as 
above average or below average in job performance, and Fig. 9-7 
shows that when he established the average score of the group 
^ Tiffin^ Josdph. Indmtrial Psychology. Now York: Prentice-Hall, 1647> 
p. m 

> Cook, D. W. P^yGkological mda in the eeleciton of mrJeen. Fenonnd Sories 
No. 50. New York; American Management Assoeiatiooi 1041. 



PRINCIPLES OP PERSONNEL TESTING 


186 

as his critical score, 89 per cent of the superior group on the job 
exceeded it \ 7 hereas none of the poorer group scored above 
average. 

Relay Adjusters. — In the same report ^ Cook presents the re- 
sults of a study involving electrical-relay adjusters. He divided 


Above ove. on |ob 
100% 

Below 
ove on Job 


11 % 


Per cenl below 
overage ecore 


0 


69 % 


Percent obove 
overoge score 


Fia 9-7 —Proportion of above- and belotv-averagc soldcrcrs who attained or 
failed to attain a composite critical score for a test battery, {From Cook.) 



91 % 


on tests 


Leoet successful on 
lob 


Most suceessfijt on 
job 


Fia, 0-8.— Proportion of relay adjustors scoring above and below average on 
tests who were considered most successful and least successful on tho job. {From 
Cook.) 


tile group into the upper and lower half on their combined scores 
on two tests including a monotony test. Figure 9-8 indicates the 
relationship between combined scores and job success. Note 
that among those who were above average on the combined tests, 
91 per cent were considei'ed successful on the job whereas of 
those below average on the tests only 25 per cent were in the suc- 
cessful group. 

Cable Formers. — ^Two maze tests and a tweezer dexterity test 
proved useful in differentiating between the more eflBoient and 
the less efficient employees engaged in a cable-forming opera- 
tion.* Figure 9-9 shows that of those above average on test 
scores 80 per cent were above average in efficiency and that of 
those below average on test scores 25 per cent were above average 
in efficiency. 

Bench Workers. — Job titles are frequently poor indicators of 
the true nature of the job, and the same title frequently is used 

^Jbtd. 

*Ibid. 



TESTS FOR MECHANICAL AND OTHER MANUAL WORKERS 137 

for designating a wide variety of different jobs. The title of 
bench hand or bench worker is no exception. Hayes* reports a 
study in which two dexterity tests in combination with an index 
of experience were related to rate of learning with 308 bench 


Above ove. score 


eox! 


|so% 




J 

^■■ 20 % 

I % above 0 


Below ove. 
acoro 

% below overogo in I % above averags In 
efficiency efficiency 

Fici. 0-0^ — Proportion of ubove^ and bclow^veragc cable formcre ecoring above 
and below average on masse and tweezer dexterity testa. {From Cook ) 


6 Scoring 
170-183 

34 Scoring 
I50*'I69 

61 Scoring 
130*149 

102 Scoring 
110-129 

66 Scoring 
90*109 

1 1 Scoring 
73-89 




m 




1 


100 SO GO 40 20 0 

Par cant fair or slow leorners or did 
not comploto training period 


20 40 60 80 

Par cant quick laarnors 


100 


■ Old not compislo ^ learners 

training ^ 

^ Slow laarnars Q Quicli learners 

Pia 0-10.— Relationship between composite score on two dexterity tests end 
speed of learning for a group of 308 bench hands. {From Ti^in?) 

hands. Figure 9*10 shows that the proportion of quick learners 
ranged from 100 per cent among the group having highest com- 
posite scores down to 12 per cent among the group having lowest 
composite scores. Hayes further reports a follow-up study of 
sixty-two new hires. At the end of their first six months of em- 
ployment the fifteen with the highest composite scores were per- 
forming considerably better on the job than the remainder. 

1 Hatids, op. ciL 
^ TirriN, op. dU p. 186. 



138 PRINCIPLES OP PERSONNEL TESTING 

Laundry Workers. — ^Althougli it has been the policy in this 
and other chapters to report only those studies in which objective 
validity data are presented, it seems desirable to deviate from 
that policy in order to cite Ten Brocck’s report ^ on the selection 
of laundry workers. No validity data are presented, but the 
author reports excellent results with a battery that includes the 
following tests: Minnesota Rate of Manipulation Test, Otis Em- 
ployment Test, Minnesota Vocational Teat for Clerical Workers, 
and Washburne’s Social Adjustment Inventory. Obvioudy there 
is a need for objective studies in the laundry industry. 

Punch-press Operators. — ^In her study’' previously cited, 
Hayes studied 254 operators of punch presses and similar nuir 
chines. Here again she used the same two pegboards plus an 
index of prior experience. Among those in the highest bracket on 
the composite score 65 per cent were considered quick learners, 
whereas only 20 per cent of those in the lowest score bracket were 
so rated. The miniature punch press described by Tiffin and 
Greenly* is an example of a trade test of the manipulative 
variety. While ability grouping validity data are presented, the 
authors demonstrate that the test differentiates between groups 
of people with and without punch-press experience. They fur- 
ther found that test performance was related to speed and ac- 
curacy ratings made by foremen, the coefficients of correlation 
being —0.55 and 0.63. They also report a correlation of 0.60 be- 
tween test scores and safety ratings. 

Paper-converting-machine Operators. — Jurgensen * reports a 
study involving 212 operators of paper-converting machines. In 
describing the job the author indicates that the operators were 
engaged in removing tissues from the macliine, inserting adver- 
tising matter, and placing them on conveyors. Characteristic of 
many machine-tending jobs, its emphasis is fundamentally on 

^ Ten Brokck, Delphinh L. How aptitude testa can aid you in employee 
BGlection. Laundry Age, Jon. 1, 1016, 86^8. 

> Hatbs, op. ctt. 

^ IWiN, Josbfh, and Ckeenly, R. Experimonis in iho operation of a 
punch press. J. apph Peychoh, 1939, 23, 160-460. 

*■ JnnoKNaBN, Ch^iFPORD E. KxtonBion of the Minnesota Rato of Manipulation 
TM. J. aypL P^ohoh, 1013, 27, 164-160. 



TESTS FOR MECHAUWAL AND OTHER MANUAL WORKERS 139 

packing. Jurgensen evaluated job adequacy by means of three 
supervisory ratings on each employee. Using certain extensions 
of the Minnesota Bate of Manipulation Test, he obtained a co- 
efficient of correlation of 0.61 with these ratings. 

Bar-mill Employees. — In one steel-mill study a battery of 
three tests was administered to each of twenty-eight applicants 
at the time when they were hired as bai^mill employees. Each 
of the tests, the Purdue Industrid Turning Classification Test, 


8 

a.* 

c 

^3 

o: 

2 

I 



Fra. 9-11 ^Proportion of each of five rated groupa of bar-mill employees who 
BGored abovo and bolow a critical score on tlie Pttrdtie Indvalnal Trainmff Classic 
ficalion Test, 


the Adaptability Test, and the Purdue Mechanical Adaptability 
Test, showed significant relatiotuships with supervisory ratings, 
the correlation coefficients being respectively 0.72, 0.59, and 0.57. 
The Purdue Industrial Training Classification Test was particu- 
larly effective as is demonstrated in Fig. 9-11. Note that when a 
critical score of 8 is imposed, 100 per cent of tiiose rated 4 or 5 ore 
above and all of those rated 1 and 83 per cent of those rated 2 are 
below. 


UACHINS-TOOL LEARNERS AND APPRENTICES 

Nature of Machine-tool Work. — Considerable confusion exists 
in the thinking of many people when they consider the require- 
ments of jobs that utilize machine tools. Probably no other 
single area of work activity embraces employees with such a wide 
range of talent. A lathe, for example, may be operated by a 
routine employee engaged in a highly repetitive operation which 
requires an absolute minimum of planning and judgment by the 



PRINCIPLES OF PERSONNEL TESTING 


140 

operator. Or the some lathe may be operated by a specialist, an 
individual who is quite expei't on all phases of lathe operation, 
who plans and executes the lathe work including the setup ac- 
tivity, but who professes no skill at all on other machine tools. 
And, finally, a lathe may be one of many tools utilized by an ex- 
pert machinist or tool- and diemakcr. To classify all of these and 
the many others that could be named together in the testing pro- 
cedure would be similar to grouping together all people who use 
pencils. True, the manipulative act of pushing tlie pencil is the 



Fia. 0-12 — ^Proporlion of machinc-tool-opomtor trainees above and below crit- 
xoal teat scores who were considered in the best and poorest group» {From Rosa,) 

same from job to job, but other aspects of the work tend to mag- 
nify or minimize the relative importance of pencil utilization 
from job to job. Likewise, the progression of jobs from the opera- 
tor level to the tool- and diemakcr level is such that although 
minimum manipulative skills arc important all along the line, the 
relative importance of the manipulative phase of the total job 
diminishes as the job level increases. Consequently, the battery 
of tests that will pick good macliine-screw operators will not 
necessarily and will probably not pick those most apt to succeed 
in a four-year apprenticeship program. 

Operators and Trainees. — Patten^ in an early study used a 
coordination test, a lock test, a box test, and a circle-centering 
test in a battery to forecast ability to operate a lathe as measured 
by proficiency in making five standard jobs on the lathe. Ross,* 
working with a group of adult trainees on various machine tools, 
administered O’Connor’s Finger Dexterity Test and by estab- 

^ pATTtiN, EvEnDTT F. An experiment in testing engine Jntho aptitude, J. 
appL Ptychol, 1023, 7, 16-20. 

* Bobb, LawnmicB W. Besults ot testing inAohine.4iOol trainceB. Penm. J., 
1043, 22, 303-307. 



TB8T8 FOB MECHANICAL AND OTHER MANUAL WORKERS 141 

lishing a critical score of 304 seconds was able to show a signifi^ 
cant relationship with job performance. Figure 9-12 shows that 
83 per cent of those above the cutoff were in the best category 
whereas only 17 per cent of those below the cutoff were considered 
best. 

Bennett and Fear,‘ working with operators of turret lathes, 
precision grinders, milling machines, and Bullard automatics, 
found a significant relationship between composite scores on 
Bennett’s Test of Mechanical Comprehension and a hand-tool 


BM»20%ontM»t 

86 % 

Poorest 30% 
on fsitt 


100 % 




Rated below 
Qvorogt 



Rated ovsroge or 
above 


Fio. fi»13^Proportion of maohme-operator groups Bearing high and lovr on a 
test battery who were rated above and below average. {From Bemelt and Fear ) 


dexterity test and ratings. Tls is demonstrated in Fig. 9-13, of 
those scoring in the upper 20 per cent on the two tests 100 per 
cent were considered average or above on the job whereas of those 
scoring in the lowest 30 per cent on the tests only 14 per cent were 
considered average or above. The authors further state that new 
men who were hired after the battery was installed were rated 
consistently higher than was the case prior to testing and that 
not a single new man hired since tests were introduced as part of 
the selection procedure has had to be dismissed because of lack of 
ability to do the job. 

Numerous English studies have dealt with the so-called 
“engineering operatives,” an example of which is Andrews’s 
study She administered to 122 miscellaneous machine-shop 
workers’ testa designed to measure the following; intelligence 
control of movement, steadiness of movement, finger dexterity, 
accuracy of placing, bimanual coordination, and observation, 

1 Bsnnett, Georob K., and Fear, Richakd A, Mechanical comprehension and 
dexterity. Person. J , 1043, 22, 12-17 

* Andrnwb, Amy G. A year's experience of selection tests for engineering 
operatives. Oeeup. Psyeftol., 1044, 18, 126-130. 



142 PRINCIPLES OF PERSONNEL TESTING 

She reports that whereas 60 per cent of the whole group were 
rated as satisfactory, of the half scoring highest on the test that 
had been accepted 98 per cent were rated satisfactory. 

One test that shows promise is tlie Purdue Mechanical Assem- 
hly Test?- This is an individual apparatus test involving mech- 
anisms that are new to all testees. The author has reported 
validity coefficients as high as 0.55 between supervisory ratings 
and test scores of machinists and machinist’s helpers. 



Fia. Froportioa of a gi’oup of inachiuo operators rated high when vori*^ 
oua critical Bcores are employed. (Prom Lamko, Somanok, and TifiJi.) 

Screw-manufacturing Employees. — ^One of the newer trats 
that has been useful in identifying inexperienced machine oper- 
ators who are most apt to malce good on the job is the Purdue 
Mechanical AdaptabUity Test. The test, which is really an in- 
ventory of experience in mechanical, electrical, and related areas, 
was used with some success in connection with a group of forty- 
six operators in a plant engaged in the manufacturing of screws. 
The operators were rated by supervisors, and Fig. 9-14 from a 
study by Lawshe, Semanek, and Tiffin • shows the results. While 
approximately 37 per cent of the total group were rated high, the 
proportion gradually increases as higher critical scores are ap- 
plied. When the critical score of 85, for example, is applied, the 
proportion of high rated employees increases to about 48 per cent. 

^ Gunbt, M. B. The construclum and validation of a new iffpe of ynachani- 
cal assembly lest, FhJD. thcfiiSj Purduo University, 1042, 

■ Lawshh, G H., Jb., Sbmawbx, Ihbnb, and Tiffin, Joseph. The Purdue 
Meohaoiool Ad^tobility Teat. J. appL Psy<^ol, 1840, 28, 442^, 



TESTS FOR MBCHANICAL AND OTHER MANVAL WORKERS 143 

Tool Setters.— When there is a high degree of specialization, 
the tool setter becomes a key employee. Crissey ^ reports a study 
in which he used a battery consisting of tlie Minnesota Spatid 
Relatvms Test, a pegboard, smd a peg-turning test. Pigure 9-16 
shows the relationship that he obtained between composite 
scores and supervisory ratings. Note that among the one-third 
scoring highest on the tests, 69 per cent were in the third rated 
highest and that none were in the third rated lowest. Among 


Third Bcoring highest 3I%[ 


Middle third 

Third tooring 
lowest 


40% 

80% 60% 




% Ia low ond middle 



scoring third third 


Fia. fi-15^Proportion of three nitcd ability groups of tool Bettcra who made 
test BcoroB in the high third (solid bar), middle third (open bar)| nnd lower thiid 
(ahadod bar) of the group. (From Cnssey,) 


the third scoring lowest on the tests, 50 per cent were in the third 
rated lowest and none were in the third rated highest. Shuman* 
in his airoraft-engine and propeller-plant study dbowed signifi- 
cant improvement in the selection of tool setters by means of the 
Otis Self-admirastering Test of MenUd Ability, Bennett’s Ms- 
chanical Comprehension Test, and the Revised Minnesota Paper 
Formboard. 

Machinist Apprentices. — ^Pond* in making a general evalu- 
ation of a testing program for machinist apprentices compared 
the supervisory ratings of 163 apprentices who were hired with- 
out the use of tests and another 155 who were selected after test- 
ing had been added to the selection procedure. Figure 6-16 shows 
her results. Without tests, 61 per cent were considered superior; 
but when tests were used, 83 per cent were rated in the superior 
category. With respect to the type of apprentice selected she 

1 ChiiBBifiYi OitiiO L. The use of ienle in imptoving personnel proeedvree. 
Flinty Mick.: General Motoia, 1044. 24 pp. 

> Shuman, op, di. 

A Fond, Millighnt, What is new in employment testing. Fenon. J ,, 1932, 

11 , 10 - 16 , 


144 PRINCIPLES OP PERSONNEL TESTING 

says that "adoption of the minimum scores for tool-making ap- 
prentices has unproved the Quality of the group selected as much 
as was formerly accomplished in a year of trial in the course." 
When Shuman ^ applied his battery consisting of the Otis Self- 
administering Test of Mental Ability, Bennett’s Mechanical 
Comprehension Test, and the Revised Minnesota Paper Form- 


Setftciftd with iMh 

Selacted without 
tests 


39 % 


Percent roted 
Inferior 


| 63 % 
|6I% 

PsreenI rotad EUperlor 



Fra. fl-lG.r-Pn)portion of machine apprentices selected with and without tests 
who were rated inferior and superior. (From Pond.) 


board to a group of toolmaker learners he again obtained sig- 
nificant results, the Bennett showing the highest relationship 
with the criterion of any of the three. 


SERVICE ELECTRICIANS AND REPAIRMEN 

Electrical Troublemen. — ^Electrical troublemen, or trouble 
shooters, who “make quick and temporary repairs on any break 
that may occur in the power or lighting transmission circuits in 
the metropolitan area” were studied by Shartle.® His battery in- 
cluded, among others, the pursuit test and the blocks test from 
the McQuarrie Test for Mechanical AbUity, an arithmetic test, 
an electrical-circuit test, and an electrical information test. The 
troublemen were rated by supervisors and placed in A, B, and C 
classes. By establishuig a critical score Shartle was able to obtain 
the results presented in Fig. 9-17. Note that whereas of those 
rated C only 25 per cent were above the critical score, of those 
rated A all, or 100 per cent, were above the critical score. Also 
significant is the fact that when he classified these same trouble- 
men into two groups, one composed of those who had been in- 
volved in one or more accidents during a stipulated period and 
the other composed of those who had not, 67 per cent of the for- 

^ Sh0maN| op. cU, 

^ SitAitiTLEi, Carboll L, ^ fielcotiou test for electrical troublemen. Person. 
1932 , 11 , 177 - 183 . 



TS8TS FOR MSGIIANICAL AND OTHER MANUAL WORKERS 145 

mer were above the critical score as compared with 80 per cent of 
the latter. Although the difference is not large, it is sufficiently 
important to be considered. 

Dial Switchmen.— The job title, dial switchman, is used to 
designate maintenance men in a dial switchboard telephone office. 
Rossett and Arakelian ^ studied a group of these switchmen and 


Roted A 
Rated B 


Rated C 



Below critical score 


|e5% 

Above criticol score 


Pio. 9-17. — ProporLtou of three rated ability groups of Dlectrical troubJemen 
who attained or failed to attain a given cutiool score. (From BhatUei 



Fia. 9-18.— Proportion of dial switchmen trainees in each teat bcdtb brabkat 
who passed training course. {From Ro9%eit and ArakeUanJ) 


developed a fifty-minute test battery of ten tests including the 
following: electrical information, knowledge of diagrams and ap- 
paratus, adjustment of apparatus, and some maze tests. All 
switchmen attended a training school, and the authors obtained 
a correlation of 0.68 between test scores and rated success in the 
school. Perhaps even more significant is an analysis of the actual 
failures in the school. During the time of the investigation a 

^ Robsdit, Nathanibl E, Ain> AiiAxelian, Fbter. A test battery for the 
sekotioa of dial switohinen. /. appL Ptyehol^ 1939, 23, 368-366. 


PRINCIPLES OP PERSONNEL TESTING 


U6 

total of 222 men were involved of whom 43 were transferred to 
other departments because of failure. Figure 9-18 presents an 
analysis of these failures by test scores. Of those scoring from 10 
to 20, 17 per cent passed; or, in other words, 83 per cent were 
transferred to other work. Notice that the curve moves steadily 
upward and that among those scoring between 70 and 80, 100 per 
cent passed. 

Errors 
1-6 

7-12 

15-18 

19 or 
more 

Fxd. 9-10/— Proportion of elcctrioal to&tors and inspoctotn scoring at various 
levels on n perception teat who wero above average and below average on job 
efficiency* {From Cook,) 

Electrical Testers and Itwpectors. — Cook ' in the report re- 
ferred to used a perception test in a study involving a group of 
electrical testers and inspectors whose work consisted primarily 
of tracing wiring diagrams and locating trouble. Figure 9-19 
indicates that when the employees were divided into the half 
above average in efficiency and the half below average in ef- 
ficiency, a significant relationship with errors on the test was 
revealed. The figure shows that of those with six or fewer errors 
100 per cent were in the upper half and of those having nineteen 
or more errors 100 per cent were in the lower half. 

Ice-company Mechanics. — ^The Purdue Mechanical Adapto/- 
bility Test previously mentioned was administered to fourteen 
general mechanics, or “handy men,” in an ice company.® These 
same men were also rated on a five-point scale by the owner- 
manager. Figure 9-20 shows the results in a seattergram form. 
Not only did the highest rated man receive the highest score and 

1 Odok, op, oil. 

• Lawbhd, Sbmahbk, and TiFriN, op, eit. 




TESTS FOR MECHANICAL AND OTHER MANUAL WORKERS 147 

the lowest rated man receive the lowest score, but most others 
tended to fall in line. Note, for example, that a critical score of 
95 would pass all of those rated 4 or 5 and fail all of those rated 
1 or 2. 

Aircraft Mechanic Learners. — Jacobsen ^ reports a study in- 
volving several different classifications of aircraft mechanic 
learners which are grouped together here for convenience. All 


6 


.4 





• • 


2 


• • # • 


t 1 J 1 1 1 J I I I I- i 

70 75 80 65 90 95 tOO 105 IJO 115 120 
Test score 

]Fiq. Hclationship between cfHcionoy rating and Purdue Mecfumcdl 
AdaptahUxiy Tent score for fourteen ico-company mechanioa. {From Lamhe, 
B&manek, ond Ttffin.) 

learners were in ta^aining classes, and correlations between test 
scores and class proficiency are given. With aircraft instrument 
repairmen he used Pressey’s Senior Classification Test, Pressey^s 
Senior Verifying Test, and Bennett's Test of Mechamcal Com- 
prehension and obtained a multiple correlation of 0.61. Using 
the same battery with a group of aircraft repair mechanics, he 
obtained a correlation of 0.42. For the classification of aircraft 
electrician he used Bennett's Test of Mechanical Comprehension 
and a test designed to measure three-dimensional visualization. 

Cotton-mill-machine Fixers. — Using an adaptation of die 
Stenquist Mechanical Assembly Test, Harrell ‘ had forty-five 

^ Jaoobun, Eukin E. An evaluation of certain tests in predicting mGchanio 
learner achievement. Eduo. Psychol, Meas,, 1943, 3, 269-267. 

* HaasBix, WiLUnn. The validity of certain mechanical ability tests for 
selecting cotton mill machine fixers. J. aoe. Psychol,, 1037, 8, 279-282. 



PRINCIPLES OF PERSONNEL TESTING 


148 

loom fixers in a cotton mill rated by their overseers and obtained 
a correlation of 0.45 between scores and ratings. In another 
study he had the carding department overseer rate “according to 
mechanical ability” forty men who had been fixers and obtained 
a correlation of 0.84 between scores and ratings. A similar in- 
vestigation with ten spring-frame fixers gave a correlation of 
0.78. 


OTHER TRADE GROUPS 


Printing Pressmen Apprentices. — ^The Minnesota Paper 
Formboard was used by Hall * for identifying superior printing 


Score of 45 or more 
80% 

Score of 44 
or les8 

Rated Inferior 


10 %^ 


64% 




15 % 

Rotod overogo or 
auportor 


Fia* 0-21,— Proportion of printing prcsstneti apprentices sroring abovo and be- 
low a cniical score who weic mtrcl inferior (shaded bnra), average (open bars), 
and superior (black bam). (From JIaU.) 


pressmen apprentices. Eighty-six men were rated by three dif- 
ferent instructors, and the resulting composites were divided into 
inferior, average, and superior. A critical score of 46 on the test 
was established, and Fig. 9-21 presents the results. Of those 
above the critical score only 10 per cent were considered inferior, 
and of those below tlio critical score 80 per cent were considered 
inferior. Of the first group 26 per cent were superior, and of the 
latter group only 5 per cent were so considered. 

Miscellaneous Shop Workers. — ^Earle * reports the use of the 
Stenguist Mechanical Assembly Test with trainees in a number 
of shop areas. Using instructors* estimates of proficiency as a 
criterion, he obtained correlations as follows: fitters, 0.49; car- 
penters, 0.82; blacksmiths, 0.37; and electricians, 0.07. None of 
the groups included more than twenty-one men. 

* Hau/, Mn/ioH 0. An aid to the aelection of pressmen apprentiooe. Person. 
J., 1030, 0, 77-Sl. 

*EAnui, P. M. Teaia of mechameal abiUti/. Studioe in Vocational Guidance, 
Report No. 3. London: National Instiluto of ludualrial Feyoholosy, 1020 . 43 pp. 



TESTS FOR MECHANICAL AND OTHER MANVAL WORKERS 

Lawshe ^ d6inon8trat6d the usefuln^s of the PurduG Industrial 
Training Classification Test in identifying successful miscel- 
laneous shop trainees in three different rituations. The 26 per 
cent rated highest in every instance mode significantly higher 
average scores on the test than did the 25 per cent rated lowest. 

Mixed Apprentice Groups.— Two studies* have been con- 
ducted with the Purdue Mechanical Adaptability Test which deal 
with mixed apprentice groups. Although both of these groups 
include machinist’s apprentices and consequently might have 



Fig. 9-22/^Proportion of apprentices who were rated high when various criti- 
oal scores on the Purdue Mechanical Adaplability Teel were considered* (From 
Lawsho, Semanek, and Tiffin,) 

been discussed in an earlier section, consideration has been post- 
poned until this point because combinations of occupational 
classifications are involved. One group of twelve in a steel mill 
included the following: four machine shop, two masonry, and one 
each of electrical, maintenance, carpenter shop, blacksmith, pipe 
shop, and electrical construction. A low but positive correlation 
of 0.89 between ratings and scores was found. 

The other apprentice study was conducted in an electrical 
manufacturing plant with a group that included machinists, tool- 
makers, diemakers, foundrymen, and miscellaneous electrical 

^ Lawshb, O. Hj Jr* The Purdue Industrial TrainiuB dassifioation Teat. 
J, apvl Paychol, 1042,26, 770-776. 

> Lawshb, SiudiWifiK, Am Tufrik, op. mL 



150 PBINCIPLBS OP PERSONNEL TESTING 

workers. These men were rated by their supervisors, and Fig. 
9-22 shows the results after the ratings were corrected for dif- 
ferences among judges. The apprentices were divided into the 
lowest rated 47 per cent and the highest rated 53 per cent, the 
latter group being called “high.” As successively higher critical 
scores are applied, the percentage of high rated apprentices sys- 
tematically increases. 


SUMMARY 

The term mechanical aptitude has little meaning, since it is 
employed in such a wide variety of situations. It is quite likely 
that there are three component of mechanical ability: the ca- 
pacity to understand mechanical relationships and processes, 
manipulative ability, and a cluster of such motor abilities as 
strength, speed of movement, and endurance. Mechanical ability 
tests may be classified as paper-and-penoil vs. apparatus tests, or 
they may be classified by function into one of the following: in- 
formation, spatial relations, mechanical comprehension, or in- 
terest. 



CHAiPVER X 


TESTS FOR CLERICAL AND OTHER 
OFFICE EMPLOYEES 


Clerical tests are among the most widely used of all tests in 
business and industry. Some of tho reasons are obvious: (1) 
Clerical tests are usually of the trade test variety, frequently em- 
bodying a job sample, and have high “face validity”; (2) results 
are frequently more obvious in the office than in the plant, par- 
ticularly when no validity studies are undertaken; and (3) office 
managers are frequently more easily “sold” on testing programs 
than are plant superintendents and industrial supervisors. In 
spite of the wide use of such tests, however, the literature is more 
void of actual validity studies than is the case with studies in- 
volving plant-operating jobs. No doubt this fact is associated 
with tho first reason given above; they so closely resemble the 
real thing that it hardly seems necessary to many users to conduct 
validity studies. 

Nature of Office Work. — ^Like job classifications in tlie plant, 
classifications in the office vary conriderably from company to 
company. Generally, however, there are four types of workers: 
clerks, engaged in filing, checking, tabulating, and miscellaneous 
office tasks; typists, engaged primarily in typing but usually per- 
forming some function performed by clerks; stenographers, 
usually employing shorthand and typing in their work; and mar- 
chine operators who may use any kind of office equipment from 
the duplicator or dictation transcriber to machine bookkeeping 
and posting equipment. Only in very large offices is there a clear 
demarcation among these various types of jobs, and one fre- 
quently finds such job titles as clerk-secretary and receptionist- 
typist. Also the extent to which the employee engages in true 
secretarial activity merits attention. Some employees are given 

in 




162 PRINCIPLES OF PERSONNEL TESTING 

considerable responsibilily and exercise ingenuity and independ- 
ent judgment; others perform tasks just as routine as does any 
line employee in the plant. These facts should demonstrate to 
the employment man the necessity for validity studies on office 
jobs. 

Criteria in Office Jobs. — ^One comment is frequently offered 
when the subject of validity in connection with office jobs is dis- 
cussed: “No two girls do the same thing.'* This is often true. 
The Committee on Tests of the Life Office Management Associa- 
tion ^ has offered the following list of possible criteria: 

1. Salary. (This, of course, has certain limiting factors in- 
cluding economic conditions, unequal opportunities from 
department to department, and the fact that salary may be 
related to length of service and not to adequacy.) 

2. Job level or grade in the salary evaluation or classification 
plan. 

3. Quality and quantity of output (in such areas as card 
punching, ediphone or dictaphone transcription, and the 
typing of standard forms or reports). 

4. Training time (applicable only when an employee may 
progress at her own rate). 

fi. Promotability measured in terms of job level attained 
within a specified period of time. 

6. Ratings of supervisors or others. 

OFFICE CLERKS 

Routine Clerical Work. — ^Roberts and Ostermick ‘ conducted 
a study involving twenty-two clerks in which they used the Won- 
derlic Personnel Test in combination with the Minnesota Vocor 
tional Test for Clerical Workers and by using a critical score of 
15 on the former and 205 on the latter were able to show sig- 

^ The appUcalion of ptychologieal tetts to Oto selection, jjlaeement, and 
transfer of elorical employees. Life Office Managomciil: Aeaooiatioii, Committoe 
on Tests. Report No. 6, New York, 1042. 28 pp, (Frooessed.) 

> Roberts, Wiluaii H., and Obtbbmick, Ralph E. Test aeorea and ratinga 
of Ditto machine operators. Milwaukee, Wie.: Allis Chalmers Mfg. Co., 1046. 
(MimeoO 



tests for clerical and other office employees 163 

nificaiit relationships with supervisory ratings. Figure 10-1 
shows their results. Note that 100 per cent of those rated above 
average passed both critical scores whereas only 20 per cent of 
those rated below average exceeded both critical scores. They 
report correlation coefficients with supervisory ratings of 0.54 
with the Wonderlic and 0.53 with the Minnesota. 

Mental Ability and Clerical Jobs. — As in the above case, men- 
tal ability tests quite frequenUy "come through” in various kinds 


Rated obove average 
Rated average 



80% 




Rated below 
overoge 

Percentoge foiling to attain 
both critical ecoree 




■eo« 

Percentoge attaining both 
critical eooree 


Fig. 10-l.--Froportion of clerks in threo rated ability groups yrho attained and 
failed to attain oritioal score in a test battery. {From Roberts and Oatermick.) 


of office and clerical jobs. Since mental ability is discussed at 
length in Chap. V, one study is sufficient at this point (see par- 
ticularly Fig. 6-9). Pond and Bills* report a study involving 
286 men engaged in clerical and related occupations. The Sco- 
vUle Mental Ability Test was used, and their results are presented 
in Fig. 10-2. As indicated in the figure, of those scoring 170 or 
above 00 per cent were considered satisfactory and 10 per cent 
unsatisfactory. Note that the percentage considered satisfactory 
gradually decreases with lower score brackets. 

Test Batteries. — ^Further insight into the nature of clerical 
ability is provided by two studies, each involving a number of 
different tests. Armstrong * administered a battery of six tests 
to a group of ofi^ce and clerical employees Included were tests 
designed to measure memory and attention, arithmetic, correct 
copy work, checking errors, filing, and common sense and reason- 

1 Pond, Milucbnt, and Bilus, Marion A. Intelligeuce and clcricivl jobs; two 
studies of relation of test score to job hold. Person, J,, 1033, 12, 41-66. 

s Arhstbono, T. 0. New methods in promotion and hiring. Person. J., 1036, 
15, 280-283. 



164 PUmaiPLES OF PERSONNEL TESTINQ 

ing. He obtained a correlation of 0.80 with the ratings of de- 
partmental managers. 

As a part of l^e Minnesota studies, Dvorak ^ presents group 
profiles of a group of poor female ofiice clerks and a group of good 
clerks. Every one of the following tests discriminated in favor 
of the good employees: Pressey’s Senior Classification Test and 
Senior Verification Test, Minnesota Vocational Test jor Clerical 
Workers, O’Connor’s Finger Dexterity Test and Tweezer Dex- 


ThoM scoring 
170-209 
Thoaa BOoHng 
160-169 
Thoia scoring 
(10-159 

Proportion considerad 
UDsaticfoctory 



90% 


Proportion considered 
sotlsfoctory 


Fxa. lO-2^Froportion of three groups of clerks classified according to tost 
BCoicB who were considered satisfactory imd unsatisfactory, (Frojn Pond and 
BUb.) 


terity Test, Minnesota Rate of Manipulation Test, Minnesota 
Mechanical Assembly Test (Box A), and the Minnesota Spatial 
Relations Test. 


TYPISTS 

Office Typists. — Cook* reported one of the few published 
validity studies involving typists. Using the follow-up method, 
he administered a speed and accuracy typing test to 190 typists 
at the time of employment. After the first six months on the 
job, a comparison was made between test scores and job per- 
formance. Using a “67 per cent bogey” to indicate that level of 
job performance below which a girl could not fall and still be 
called satisfactory, he obtained significant results which are 
shown in Fig. 10-3. Of those girls who type forty words per 

^ DvobaXi BeAtricb, Differential occupational ability pattern, Univ. Minn, 
Bull Bmpk/t, scab. Bes, InsL, Vol, 111, No, 8, Univorsity of Minnesota Press, 
1934. 

> OooK, D. W, Some practical results from tests. Proa. Personnel Seleoiion 
Clinic. Kansas City, Mo.: Greater Kansas City Committee for Boonomio De- 
velopment, 1946, 



TESTS FOR CLERICAL AND OTHER OFFICE EMPLOYEES 16S 

minute or better on the test at the time of employment 100 per 
cent made the 67 per cent bogey. Of tiiose who typed fewer 
than forty words per minute on the test only 45 per cent made 
the bogey and 65 per cent did not. 

Typing Tests. — ^There are a number of typing tests on the 
market. Among the most widely used are the Thurstons Exam- 
ination in Typing and another by Blackstone. One of the newer 
tests is Kimberly-Clark Typing AbiUty Analysis, in one report ^ 


40 word* or botlt r on tait 


Fawar thon 

40 word* I 


100 % 


Proportion failing to | Proportion moktng 
moko 67% boQoy 67% bogty 


Fia. 10-3^Froport!on of typists claasificd according to test score who attained 
or failed to attain a certain job-proficiency level. (From CooD 


on which are given validity correlations in the 90 ’b between scores 
and estimates of adequacy of typists in some paper mills. 


STBNOGEAPHERS 

Mental Ability. — ^Roberts and Ostermick * report a study in- 
volving sixteen stenographers in which they used only Wonder- 
lie's Personnel Teat, A critical score of 20 on the test differen- 
tiated between high and low rated employees as is shown in Fig. 
10-4. Note that of those rated "above average” 100 per cent 
passed the critical score and that of those rated "below average” 
100 per cent failed the critical score. 

Transcription Tests. — ^Much testing of shorthand skill is done 
in a hit-or-miss fashion. Frequently standardized tests are not 
used; and even when the transcription section of the Blackstone 
Stenographic Proficiency Test is used> considerable skill is needed 
by the examiner to ensure proper timing. The Seashore and 
Bennett Stenographic Proficiency Tests which are available on 
phonograph records seem to be the answer. Five letters of vary- 

^ JuBOBNBiN, CLmoHD 13. A test for selooting and truning iaduBtrial typists. 
Edve. Ptychei. Meat,, 1042, S, 400-426. 

> Hobbktb and Obtbbmick, op. cii. 



166 PRINCIPLES OF PERSONNEL TESTING 

ing lengtlis and dictated at five different speeds are included in 
each of two forms. 


RaUd abovi Qv«roge 



Rolad ovarags 45% 
100 % 

Rotad 
balcMN ova 








Percentaga foiling to 
of tain critical acore 


55% 


Parcantoga attolning 
critical score 


Fia, KM^Proportion of three rated ability gi'oups of Btenographera who at- 
tained and failed to attain a critical score on the Personnel Test. (From Rob-* 
erts and Ostermich.) 


ro above 87 


|85% 




56% 


Names cheoklng ioore 
over 130 




Number checking score 
over 130 


|96% 


W///////AyM 


60% 


PiQ, 10-5.— Proportion of better group (solid bars) and poorer group (shaded 
bars) of machine bookkeepers who attained critical scores on three different tests, 
(Prom Hai/.> 


OFFICE ECACHINE OPERATORS 

Machine Bookkeepers. — Hay^ reported a study involving 
forty machine bookkeepers. He used a mental ability test and 
both the names and number sections of the Minnesota Voca- 
tianal Test for Clerical Workers. When the bookkeepers were 
divided into the best twenty and the poorest, the results pre- 
sented in Pig. 10-5 were obtained. When a critical score of 88 
(LQ.) on the mental ability test was used, 85 per cent of the 
better group and 55 per cent of the poorer group exceeded it. 

* Ha.t, Edwasd N. Tests in industry. Person. J,, 1941, 20, 8-16. 


TESTS FOR CLERICAL AND OTHER OFFICE EMPLOYEES 167 

More marked differences were obtained witli the nameB and num- 
bers checking tests os is shown in the figure. 

Ditto-machine Operators. — A study by Roberts and Oster- 
mick/ though involving only twelve ditto-machine operators, 
should be cited. Using the same critical scores indicated above 
in their study of clerks, they found that Wonderlic’s Personnel 
Teat and the Minnesota Vocational Test for Clerical Workers 
showed a significant relationship with rated success. Figure 10-6 


Rqted obove overaga 


100 % 


Rofad overagi 

100% 






Rated 
below averoge 




Proportion falling to I 
obtain both crltlcol soom 


|e(r% 


Proportion ottaintng 
both orltlool loone 


Fxg. 10-6.— Proportion of three mtod ability groups of ditto-maohine operators 
who nttnined or failed to attain critical scores on two tests. (/*Vo 7 ii RoberU and 
Ostermick,) 


shows that whereas 100 per cent of those rated above average 
attained boffi critical scores, 100 per cent of those rated bdow 
average failed to attain both critical scores. 

Miscellaneous Machine Operators. — Ghiselli* through the 
United States Employment Service used the Minnesota Vocof 
Uondl Test for Clerical Workers in connection with a number of 
different clerical job classifications, five of which involved ma- 
chine operation. The number of people ranged from 26 adding- 
machine operators to 121 card-punch-maohine operators. Figure 
10-7 shows his results. Of those who were above average on the 
criterion the respective proportions that were above and below 
average on the tests are presented. Although the degree of dis- 
crimination is slight in all jobs, the consistency of the pattern in- 
dicates i^e almost universal presence in machine clerical jobs of 
the trait or skill measured by the test. 

1 RoBBRifi AND Obtbrhigr, op. eil. 

* GHtSBLi.!, Edwin E. A compariaon of the Minneaota Vocational Teat for 
Clerical Workeia with the general clerical battery of the United Statoa Employ- 
ment Service. J. appL PaytAol., 1942, SO, 76-00. 



PRINCIPLES OF PERSONNEL TESTING 


158 

Gottsdanker ^ worked with forty-four female learners engaged 
in the operation of crank-driven calculators. He used a number 
dot-location test which he referred to as a paper keyboard test 
and obtained a correlation of 0.49 with his criterion. An arith- 


Adding machine operators 
Bookkeeping machine opec 
CalGulolIng machine opar 




32% 



3T% 






Card-punch machine opnn 39%! 




■ 66 % 
52 % 
■63% 
■61% 


Calculator (key octuated) opec 



63% 


Proportion below | Proportion above 
overage on test average on test 


Fio, 10-7.— Proportion of those judged above average in five job clnsaificationa 
who were above average on the Minnesota Vocational Teat for Clerical Workera, 
(From GhiaeUi.) 


metic computation test and a number-comparison test yielded 
correlations of 0.36 and 0.29 respectively. 


JOB AND FACTOR ANALYSIS 

Primaiy Mental Abilities. — The concept of primary mental 
abilities was discussed in Chap. V. An application of this con- 
cept to clerical work has been made by the Committee on Tests 
of the Life Office Management Association,^ Three separate 
tests were designed, paralleling three of Thurstone’s primary 
mental abilities, N (numerical facility), V (verbal facility), and 
M (memoiy). The first test (N) was administered to 113 clerical 
employees who were engaged in checking and posting operations. 
The results of the study are presented in Fig. 10-8. The range 
of scores was divided into three brackets, and the proportion of 
those scoring in each bracket who were considered above average, 
average, and below average in job performance is diown. Where- 

^ OoTTSDANXEB, RomsHT M. MeBSures of potentiality for machine calculation. 
J. ttppl. Psvokal^ 1043, 27, 233-248. 

3 The application of ptt/choloffical ieela to the seleoiion, plaeomenl, and trano- 
fcT of deriad employees, Idfo OiHco Management Association, Gommitteo on 
Teste, Report No. 6. New York, 1042. 28 pp. (Processed.) 



TESTS FOR CLERICAL AND OTHER OFFICE EMPLOYEES 159 

as 36 per cent of the high scorei's were above average, none of 
the low scorers were above average. 

The verbal facility test was administered to thirty-three cleri- 
cal employees engaged hi discussion and correspondence, and 


Thoie scoring ISO end 
obove 

Those scoring 
90 *-149 
Those scoring 
89ond below 


18% 


46% 


40 %^^^ 


44% 




14% 



Proporfion overoge and 
below overoge 


Proportion 
obove ove« 


Fio. 10-8.— Proportion of checking and poabing clorka classified according to 
test scores who were considered below nverage (shaded barB)j average (open 
bar8)j and above average (solid bars) on the job. (From the Life Office Manage^ 
ment Asaociaiton*) 


Those scoring tl3 
or better 

Those scoring 
70- M2 


Those ecoring 
77 or lees 


I7%i 

2s%mr 

40% 

27 % ^^ — 

86 % 





|33« 






Proportion overoge ond 
below overage 


Proportion 
obove overoge 


Fio« 10-9^ — ^Proportion of clerical employees engaged in verbal activities olaasi- 
lied according to teat score who wero considered below average (sliaded baTs)^ 
average (open bars), and above average (solid box's) on tho job. (From the Life 
Office Management Asaociaiion,) 


similai* results are presented in fig. 10-0. Note that the pro- 
portion of above-average employees increases as the score in- 
creases and that the number of bclow-averago employees 
increases as tibe score decreases. Figure 10-10 tells a similar 
story regarding Test M which was administered to employees 
engaged in typing and machine operation. Althou^ the vadues 
vary somewhat, the pattern is identical. 

Job Classification. — ^Another test of the Life Office Manage- 
ment Association identified only as Test I A (apparently a general 
mental ability test) has been used in a unique fashion that 
demonstrates an interesting relationship between human abilities 
as defined and measured by tests, and a clerical job classification 



PRINCIPLES OP PERSONNEL TESTING 


160 

system. All clerical positions were classified into four grades as 
indicated below. 

Grade I — Simple clerical work 
Grade II — Complicated clerical work 
Grade 112 — More complicated clerical work 
Gi'ade III — ^Decision-making jobs 


An analysis of the test scores of 200 clerical employees who had 
been on the job from five to ten years was made with regard to 


Thoi« scoring 20 ond obove 


11 %^ 


33 % 


66 % 


Those scoring 
12- Id 

Thou scoring 

II or less 


50%| 


W/AV/Am 


64%| 




27% 


|26% 


ProporHon overage and 
below overage 



9% 

Proportion 
above overage 


Fig. 10-10.--Proportion of typiata and machine operators olaasificd according 
to tost score who were conaidorcd above average (solid bars), average (open 
bars), and below average (shaded bars) on the job. (From the Life Office Man- 
agement Association,) 


the job level or grade attained in that period of time. The 
results are presented in Fig. 10-11. Approximately fifty of the 
employees were holding jobs of eaoh level, but the figure shows 
the percentage shift as higher score brackets are considered. For 
example, of those making test scores of 00 or better 27 per cent 
were holding Grade III jobs. However, if only tliose scoring 150 
or better are considered, 52 per cent were holding Grade III jobs 
and the percentage continues to increase with higher critical 
scores until, when a minimum of 190 is considered, 85 per cent 
have attained Grade III jobs. Similar curves are presented for 
Grade II and higher jobs and for Grade II2 and higher jobs. 
Thus the curves can be interpreted as follows: If a critical score 
of 150 were imposed, within a five- to ten-year period, about 95 
per cent would attain a Grade 11 or higher job, 85 per cent would 
attain a Grade II2 or III job, and 52 per cent would attain a 
Grade HI job. The hiring implications where a promotional 
sequence is involved seem dear. Hiring specifications should be 



TESTS FOB CLERICAL AND OTHER OFFICE EMPLOYEES 161 

establi^ed not only in the light of the demands of present jobs 
but also with a thought to the demands of the promotional se- 
quence and procedure. 



Fio. lO**!!/— Proportion of clerical employees attaining or exceeding spedfio 
tost scores who attained various clcncal job grades in five to ten years on the 
job. {From the Ltje Office Mamgcmeni AsBOoiation,) 

SUMMARY 

Clericid tests are among the most widely used and least fre- 
quently validated of all personnel tests. Generally, they are 
useful in identifying successful or potentially successful em- 
ployees in four general classifications: clerks, typists, stenog- 
raphers, and business-machine operators. Test validation in 
this area is just as essential as in any oUier; and although cri- 
teria are not easily found, some measure of job success is usually 
available. Evidence regarding the possibility of correlating job 
analysis and factor analysis in the clerical and office-work field 
has been presented. 


CHAPTSB XI 

TESTS EOR SALESMEN AND RETAIL- 
STORE EMPLOYEES 


In starting a discussion of tests for the selection of salesmen, 
it is appropriate to quote Sbai'tlc’s statement^ that **we have 
studied something like 20,000 occupations and we find that the 
difficulty encountered in devising improved selection techniques 
for sales work is probably not equaled in any other group of 
occupations.’* The differences among jobs all of which fall in the 
“selling” bracket are extensive, and many authors and experi- 
menters including Rosenstein * have indicated the impracticality 
of trying to make generalizations drawn in one field of sales ac- 
tivity applicable to the selection of salesmen in another field. 
This, of course, is only a restatement of the whole point of view 
behind this book, namely, that validation for every job is a 
must. 


INSURANCE SALESMEN 

Life-insurance Salesmen. — More published work has appeared 
in the insurance field than in any other sales area, and the ma- 
jority of this work has pertained to life-insurance salesmen. 

Generally speaking, tests of the temperament and interest 
variety have been the most successful. The work of Strong * and 
others cited in Chap. VII is pertinent. Schultz * has used the 
temperament approach with some success as was indicated in 

^ SiiARTLiDi C» L. The measureroent and eelcction of salesmen. Afffmi, Jiev,, 
1944, as, 9^96, 

» Rosbnstbin, J. L. The ecionlifio selecUon of Batesmen. New York: 
MeGraw-Hill, 1944. 102 pp. 

» SrHONfl, Edward K., Jr, Vocalional intBresls of men a«d tsoman. Stanford 
University, Calif.: Stanford University Press, 1043. 

* ScuTJi/rz, Richard S. Standardised tcstii and statistical procedures in selee^ 
tion of hfo insuraacc salos personnel. /. appl Psychol, 1036, 20, 663-^. 

162 



TESTS FOB SALESMEN AND BETAIL-STOBE EMPLOYEES 163 

Chap. VI. Using Beckman’s Revision of the A-^ Test and the 
Root’s Introversion-extroversion Test he was able to discriminate 
with some degree of accuracy between the best and poorest pro- 
ducers among 259 life-insurance salesmen. Using as acceptable 
scores that range from the twentieth to ninetieth percentile on. 
each test he obtained the results presented in Fig. 11-1. Of 


20 % 


60 % 

Low 20% 


32% 
38 %“ 
63% 


mm 


T 




I 




Unocceptobit 

scores 


168% 
|ee% 

|47% 


Accspfabit 

seorsB 


Fio. ll-l^Proporfcion of salesmen at three difiTerent production levels who 
made acceptable and unacceptable teat scores. (From Schultz.) 


those who were considered the best 20 per cent in production 68 
per cent were in the acceptable score range, whereas of those who 
were considered the poorest 20 per cent in production 47 per cent 
had acceptable scores. 

Steward’s^ system for selecting life-insurance salesmen has 
been widdy recognized. His present system ‘ for selecting life 
underwriters consists of "a composite inventory and examina- 
tion.” His statement of its contents follows: 

1. Short form of the Otis Self-administerinff Test of Mentsd 
Ahiliiy 

2. A modified form of Bernreuter’s Personality Inventory to 
measure (a) dominance, agressiveness, and initiative and 
(6) stability 

3. A general knowledge examination consisting of thirty-two 
items in the following fields: (a) economics and finance, 
(b) bu^ess arithmetic, (c) bunness law, and (d) home 
and social problems 

4. A Vocational Interest in Selling Inventory 

^ StrwarDi Vehnb. The and value of tpcdal tats in the aelection of Ufe 
vndemniera. 1116 East Eighth St.« Lob Angelos^ Calif., 1034. 03 pp. 

^ &SBWAUD, Vernr. The dovelopmcxLt of a selectioa Byatcm, for paUBmen, 
Pmonnel, 1040 , 17 , 124 - 130 . 



PRINCIPLES OP PERSONNEL TESTINO 


164 

6. A new Personality Trait Illustrations Inventory 

6. A personal history section 

7. A rating form 

By weighting these various in8truments> Steward has been able 
to predict soles production with a high degree of success. In a 
study ' involving ten large metropolitan agencies he was able to 
demonstrate the relationship shown in Fig. 11'2. Those under- 


er-too 
ri- 80 

a 61- 70 
£ 51-60 

g 41-50 
31 MO 
D - 30 

Fia. U*2^Ayorago volumo produotioii of Jifo imdcrwritoiiB scoring at viurioug 
levels on Stewaid’s system. (From Steward,) 

writers who received between 81 and 100 points averaged sales 
totaling $6,176 for the period studied; those scoring between 71 
and 80 sold an average of $4,648, There is a consistent decline in 
average sales with lower point scores, and those who score 30 or 
less average only $229 for the same period of time. 

The work done by the Life Insurance Sales Research Bureau 
and reported by Kurtz * is perhaps the most comprehensive of all 
the work in this field. Extensive research, a discussion of which 
is beyond the scope of this treatment, has resulted in the develop- 
ment of the Aptitude Index to measure aptitude for life-insur- 
ance selling. The index consists of a personal-history rating 
form which incorporates some of the questions usually asked on 
on application blank and a personality-characteristics section 

^ Stbward, Vbrnb, AruAvtit of aalea petaomel problem. Los Angeles: Veme 
St^eRl & Associates, 1913. 

* SvBTZ, AuenT K. Booeat rescaroh is the seleetum of Ilfs insuntnoe soles' 
men. J. apjA. Pavehal., 1911, SS, 11-17, 


6,i76 

t— M^ M.64e 
2.685 

^^^$2,278 

^^ $ 1.308 

^$733 

@$229 




TESTS FOR SALESMEN AND RETAIL-STORE EMPLOYEES 166 

which is composed of a number of temperament and interest 
types of questions. The Aptitude /nde® for life-insurance sales- 
men has been validated against both sales records and agent- 
survival records, since many people who talce up insurance sell- 
ing are not sufficiently successful to remain in the business. 

Figure H-3 from the study by Kurtz * shows the production of 
211 men who were twenty-six years of age or older. The figure 
is interpreted as follows: Those with A ratings who survived 




206 % 


c 

S B 

0 

1 C 
8 

S 0 




I 


mmm 


39% 

'4r% 


All 




iroo% 

Fia. 11-3^ — Percentage of average production of n Kro\ip of 211 life-insurance 
salesmen dossed according to Aptilvda Index scores, (^roni Kvxiz^ 


sold 206 per cent as much insurance as a group selected at ran- 
dom from the group. Additional research with the Aptitude 
Index continues to aliow significant results. On company* has 
reported the facts presented in Fig. 11-4 which is based upon the 
records of 177 agents hired by one company over a one-year 
period. The length of the bars indicates the relative number of 
men in each rating class who are required to produce a given 
volume of sales. For example, according to the company’s 
records, six A men will produce as much in a given time period as 
will eighty-four men with E ratings. This company no longer 
hires E men; and in 1942, 92 per cent of the men they hired had 
A or B ratings on the Aptitude Index. 

Casualty-insurance Salesmen. — Bills’ study,® the results of 
which are presented in Fig. 7-6 and discussed in Chap. VII, pre- 

1 mi. 

* The value and use of the apiiiude index. Harvard, Oonn.: Life lasuranco 
Sales Rcseaoh Bureau, 1946. 

> BillBi Marion A Relation of soorca in Strong's iiitei’est analysis blanks to 
BU00C8S in Belling casualty insurance. /. pByohoU, 1038, 22> 97-104. 


166 PRINCIPLES OP PERSONNEL TESTING 

Bents evidence that a similar degree of success is possible in the 
casualty-insurance field. Ghiselli ^ in one study involving 
twenty-nine casualty-insurance salesmen used a combined rat- 
ing and production index as a criterion and obtained correlations 
of 0.38 with the CPA score on the Strong Vocational Inventory 
Interest BlanJe for Men and 0.32 with the Pressey’s Senior Classi- 
fication Test, These and other studies suggest tliat predictions 





Fig. — ^Number of aalcamcn having each Aptitvds Index grade that ia re* 

Quired to produce a given volume of busmcaa. {From Life Imurance &ales i2c- 
nearch Bureau.) 

can be just as successful in the casualty field os in the life- 
insurance field. 


MISCELLANEOUS SALESMEN 

Laundry-supply Salesmen. — Otis ’ has reported the results of 
a study involving seventeen men engaged in selling soaps and 
alk^ies to laundries. Using gross sales as his criterion, he ob- 
tained a correlation of 0.50 with combined life-insurance-sales- 
men scores and real-estale-salesmen scores on tlie Strong Voca- 
tional Interest Blank for Men. He further found correlations 
of 0.31 between personal data scores and s^ing cost and of 0.37 
between personal data scores and gross sales. 

OlEce Equipment. — ^Todd • reporting on two years of experi- 
ence with an unidentified battery of tests for salesmen of office 
equipment presents the values shown in Fig. 11-5. Of those 

^ Ghis^uj, EiywiK E. The uso of the Strong vocational interest blank ond 
the Pressey senior classifLoation tost in the scleotion of casualty insurance agents. 
J, ajjpL Psychol, 1042, 26» 70&-799. 

^ Oris, Jay L. Proccduros for tho solcotion of salesmen for a detergent com- 
pany. J. appl Psychol, 1941, 25> 30h10. 

* Todd, GfiionaD L. Aptitude testing 600 salesmen. Industr. Marketing, Sep- 
tember, 1044, 29, 3^-33, ISS, 140. 


TESTS FOB SALESMEN AND BETAIL-STOBE EMPLOYEES 167 

meeting a certain standard on the battery 35 per cent were con- 
sidered good and 15 per cent failures; of those not meeting the 
test standard 8 per cent were good and 25 per cent were failures. 

Machine Accotmting Methods Salesmen. — ^The study of Ryan 
and Johnson^ which was reported in Chap. VII is of interest 
here. Figure 7-7 shows signiScant results in the use of the Strong 
Vocational Interest BUmk for Men. 


Rdcommended by test 
Not recommended 


15% 50% 



135% 


Per cent conBidered |Per cent considered 
overoge or failure ' good 


Fio. 11^.— Proportion of office equipment fialeamcn classified on the baeia of a 
test battery who were considered good (solid bars)^ average (open bars)i and fail^ 
urea (shaded bars). (h\om Todd*) 


Fersonalily Tests. — ^The discu^ion of temperament tests in 
Chap. VI is pertinent with reference to salesmen. This is par- 
ticularly true of Jurgensen’s Classification Inventory * which is 
intended to be validated for each job and which promises to be 
quite useful with sales group 84 


SETAIIrSIORS PERSONNEL 

Store Salespersons. — ^Uting a modified scoring of “social 
dominance” on the Bemreuter Personality Inventory, Dodge* 
found that a critical score of 36 had significant discrimination 
value with store salespeople. A secondary group of male sales- 
people were rated as average, above average, and below average, 
and the proportion of each rated group falling above and below 
the critical score is shown in Fig. 11-6. The cut-off successfully 
identified 88 per cent of the above-average group and 75 per cent 
of the below-average group. Figure 11-7 shows similar results 

^ Ktak, T. a., and Johnson, Boatricb E. Interest aoores in the selootion of 
soleamcn and servicemen; Occupational va. ability-group scoring keys. J* appL 
Psychol, 1942, 20, 643^662. 

a JuROiONBBN, Gliffow) E. Eeport on the clnasification Inventory, a peraonality 
test for industrial use. J, appl. Psifohal*, 1944, 28, 445-460. 

A Dodob, Abthur F. Social dominance and sales personality. /, appl, 
PsychoU 1038 , 22 > 132 - 180 . 



PBINOIPLBS OP PSR80NNEL TEBTING 


168 

for a group of eighteen female salespeople. The results, though 
not so pronounced as those of the men, are nevertheless positive. 

Store Cashiers. — ^This section dealing with cashiers and the 
following one, though not involving salespeople, ai'e included in 
this chapter for wont of a more logical position in the text. 
Clarke^ using the Otis Self-administering Test of Mental 


Above overage 
Averogfi 
Berow avarago 



Fra. 11-6— Proportion of rated Ability groups of store salespeople (malcj sat- 
isfying and failing to satisfy oriUcal score on test {From Dodge.) 


|S0% 

|43% 

Exceeding critical 
ecore 

Fra. 11-7 ^Proportion of rated ability groups of store salespeople (female) 
satisfying and fniliiig to satisfy critical score on test. (From Dodge.) 


Abm average 


m 



1 

Average 

^EEAw///mm 



80% 


Below woroge 

mmmw///////A 


Below oritlcat 
ecore 



Ability, a change-making test, and a dexterity test obtained a 
correlation of 0.60 between predicted production and actual pro- 
duction. 

Viteles,* working with sixty-eight department-store cashiers 
and twenty-two inspector-wrappers, devised a test based upon 
job analysis. As described, Ibe test included measures of the 
following; ability to follow directions, accuracy, arithmetic 
ability, common-sense judgment, and language ability. Viteles 
noted a distinct tendency for those who scored within the 30 to 
70 bracket to have longer tenure. As shown in Fig. 11-8, nine 
people with scores less than 30 had an average tenure of 43 days; 

^ CuKKR, Wai/trr y. Tito evaluation of employment tc.il8. Pononnel, 1037, 
14, 133-130. 

* ViTRLisa, Moitme S. SalocliiiK caahiera and predicting length of service, f. 
Person, Bet., 1024, 2, 467-473. 




TESTS FOR SALESMEN AND RETAIlrSTORE EMPLOYEES 169 

sixty^seven scoring in the 30 to 60 range had aa average tenure 
of 140 days, and fourteen scoring 70 and higher had an average 
tenure of 61 days. Viteles further notes that “although the test 
distinguislies between competent and incompetent applicants, 
skill in handling the job does not increase proportionately with 
increased scores above the minimum passing score.” 


ThOM scoring 
10-29 

Those scoring 
30-69 

Those scoring 
TO and above 


V/fmm/. 


43 doys 




140 days 


mmmm 


61 days 


Fia* ll-8d— Average job tenure of oashiera and wrappers who scored at vatioub 
levels on a test battery. (From Viteles.) 


g34"or Um 
More than 234*' 



47% 


7% 

Producing 95 
units or more 


Fig. ll-9r— Xtelationship between test scores of departmentr^tore packers nnd 
production performnncc. {From Blum and Candee.) 


Packers. — ^In a study involving sixty-five packers in a depart- 
ment store, Blum and Candee' used the Minnesota Rate of 
Manipulation Test against actual production scores and obtained 
the results presented in Fig. 11-9. Using 234 seconds os a ci’itical 
score on the placing test, they demonstrated tha^ of those per- 
forming better than that 47 per cent exceeded 96 production score 
units whereas of those doing poorer only 7 per cent exceeded this 
production standard. They further point out that this difference 
tends to disappear with experience. 


SUMUARY 

The fact that there appears to be no universal general sales 
typo is a significant one in the personnel testing field. While it 

^ BliUM, Miuton L.| and CandeuOj Bbatbicb. The selection of department 
store packers and wrappers with the aid of certain psycliological testa. J. appl. 
Psychol., im, 25, 76-66, 201-290. 


170 PRINCIPLES OF PERSONNEL TESTING 

is quite likely that tliere are job families amoug sales jobs as 
there are in other occupational areas, the practice of validating 
jobs on each sales situation is just as essential as is the applica- 
tion of the principle elsewhere. Mental ability tests may or may 
not be useful in a particular sales battery, and tests of the tem- 
perament and interest variety on the whole are the most useful. 



CHAPTBB XII 


TESTS FOR SUPERVISORY, PROFESSIONAL, 
AND EXECUTIVE PERSONNEL 


As the reader will readily recognize from the title, the present 
chapter represents something of a miscellaneous collection of 
jobs which it has not been possible to classify systematically in 
earlier chapters. Some professional jobs have something in 
common with some executive jobs, but the relationship between 
the others is not too apparent. The purpose of this chapter 
is to pull together some of the available studies treating occupar 
tions in tliese areas. 


TESTS FOR STTPERVISOSS 

Mental Ability. — ^Every supervisor has things to learn; every 
supervisor has some '‘paper work" to do. Both of these facts 
suggest the importance of mental ability in the supervisor’s job. 
Figure 5-13 showing the relationship between scores on the 
Adaptability Test and tenure supports this hypothesis. The fact 
that none of those men who scored less than 5 were on the job 
six months later and that 95 per cent of those scoring 15 or 
better were on this job is certainly significant. It must be kept 
in mind, however, that supervisory jobs vary tremendoudy. 
Some supervisors have more reports than others; some have a 
more extensive technology to learn than others; and some must 
utilize judgment and engage in creative thinking to a greater 
extent than others. All of these varying job requirements are 
no doubt related to the level of mental abUity demanded by the 
supervisory job in question. 

Harrell ^ has demonstrated the importance of general mental 

^ Habrhll, Willard, Testing the abilities of textile workers, Georgia School 
of Teohnology, State Bngineering Esuperiment Station BulU, July'; IfMO. 14. pp. 
Testing cotton mill supervisors, J. appU P^ychoUf 1940, 24, 31->36. 

171 



172 PBING1PLS8 OF PERSONNEL TESTING 

ability in supervision with a study of forty-two overseers in three 
different cotton mills. These men who had been rated as satis- 
factory or unsatisfactory by their supervisors were given the 
Otis Self-administermg Test of Mental AbUity. Test scores were 
converted to I.Q/s by means of the standard practice, and Fig. 
12-1 shows the results. When an I.Q. of 100 is considered as a 
critical score, 100 per cent of those meeting the critical score 
were among those considered successful whereas only 70 per cent 

IQ Of 101 or batter 

30% 

rO of (00 or (eta 

Par cent I Par cant acthfcclory 
unaotlsfoctory 

Eld. — Proportion of BtiperviBors abovo and below critical IQ. who wera 

conaidcrccl aalisfaclory and unaitiafncloiy (From Harrell) 

of those failing to meet the critical score wem considered satis- 
factory. This is ample evidence of the importance of mental 
ability in this particular job, but tlie fact that the prediction is 
as low as it is indicates that mental ability os measured by this 
test is by no means all that is needed in order to make a good 
supervisor. Although some studies similar to that reported by 
Sartain^ have failed to show significant relationships between 
measures of supeivisory success and selected tests, the facts seem 
to indicate that more often than not faulty criterion data or 
highly selected samples account for the findings. Shuman, ‘ for 
example, has reported tho successful use of the Otis Self-adminis- 
tering Test of Mental Ability, Bennett's Mechanical Compre- 
hension Test, and the Rems&i Minnesota Paper Formboard for 
identifying those supervisors rated excellent in three different 
plants that are quite different in nature. 

Attitudes and Interests. — Similar to the selling field, instru- 
ments that measure attitudes, beliefs, and interests seem most 

^ Sabtain^ a. Q. Kelation betwoen ecorca on certain Rtanrlanl tcata and super* 
visory buocgss in an Aircraft factory* /. appl. Psi/chol , lOdOi SO, 328-332. 

> 8human, John T. Tho value of aptitude teats for supervisory workers in 
the aircraft engine and propeller industries. J. appl» Peyokol,, 1040, 80, 185-105. 



100% 



SUPERVISORY, PROFESSIONAL, AND EXECUTIVE PERSONNEL 173 

promising at present for selecting supervisors. Pile’s test ‘ Hov> 
Supervisef has already demonstrated its value in a few validity 
studies reported by Pile and Remmers.’’ One study pertained to 
forty-six successful supervisors and fifteen nonsupervisors who 
had been by-passed because of judged lack of ability. These 
men, employed by a company engaged in the manufacture of 
office machines, took one fonn of the test, and the results are 
shown in Pig. 12-2. This figure shows the percentage of each 


20 %. 


Selected ae supervfeore 
85% 

Not selected 


180% 


Proportion In 
lower holf 


Proportion In 
upper half 


Fia. 12-2 —Proportion of men promoted and not promoted to anpervisory po- 
aition who scored in upper and lower half on How Supormoef test, File 

and Remmera,) 


group tliat fell above and below the fiftieth percentile (median) 
of the published norms. In other words, of those selected as 
supervisors 80 per cent scored above average whereas of those 
judged not to have sufficient supervisory ability only 15 per cent 
scored above average on How Supervisef 
In another study with the same test, twenty superior and 
twenty inferior supervisors were tested. The percentile equivsr 
lent of the mean of the better group was 96 and of the poorer 
was 63. Although both groups were above the average as repre- 
sented by general norms, a significant difference exists between 
those individuals whom the company consid«3r8 superior and 
those considered inferior. 


PROFESSIONAL PERSONNEL 

Engineers. — Studies dealing with the validation of test bat- 
teries for professional groups are rare, presumably for two 

^ Fius, Qvbntik W. The measurement of Buperviaory quality in industry. 
}. apjA, Psychol^ 1946, 29, 323-327 

* Fan, QuBamir W., and Bbmmerb, H. H. Studies ia superrisory evaloation. 
J. aj/iA. PsydioL, 1040, 30, 421-426. 



174 PRINCIPLES OF PERSONNEL TESTING 

reasons. (1) As a general condition, so few of any one profes- 
sional group are employed with one company or their work is of 
such a diversified nature that validation in tile accepted sense is 



Fia 12-3, — Proportion of ongmccrs in each of aix rated ability groups who 
were above and below critical score on vocabulary teat. {From Swarlt and 
Schaab.) 


frequently next to impossible. (2) The second point hinges 
upon the fact that since all professions require extensive training, 
the training procedure has a tendency to eliminate by natural 
means the mentally and temperamentally unfit. It goes without 
saying, perhaps, that the more realistic and the more practical 
the training the more nearly true is this statement. One study 
reported by Swartz and Schwab ^ deals with a group of thirty- 
seven engineers who were rated by their supervisor on their 
ability as research engineers on a scale ranging from A to F, 
They were given the Michigan Voaitnilia'y Profile Teat and the 
Minnesota Paper Formboard, and tho resulting relationships are 
shown in Figs. 12-3 and 12-4. Figure 12-3 shows the percentage 
in each rating group who were above and below a selected critical 
score on the three scientific vocabularies in the Michigan Vocabu-' 
lory Profile Teat. Note that all of the A and B rated people were 
above the critical score and that all of the F and F rated people 
were below the critical score. Figure 12-4 based upon the Minne^ 
aota Paper Formboard shows a similar trend which, though not so 
pronounced and regular, is nevertheless significant. 

* SwMcra, B. K., and Sqhwad, R. E. Experknee wUh amphyment fMfa. 
Studies m Fcreonnel Policy No, 32, National Industrial Conferenco Board, Inc., 
March, 1041. 




SUPERVISORY, PROFESSIONAL, AND EXECUTIVE PERSONNEL 175 

Nurses. — ^Dvorak’s study * presents data on a superior and an 
inferior group of nurses. Figure 12-5 showing group profiles for 
superior aaid inferior nurses indicates significant differences on 



Below crificol score I Above erttleol score 


Fia. 12-4w--Froportioii of engioGeiB in each of six rated ability groups wIjo 
were above and below critical acorc on Minnesota Paper Form board, (Prom 
Swartz and Schwab.) 


Pressy senior clossirication test 

Number checking 

Nome checking 

Finger dexterity 

Tweezer dexterity 

Box A 


FiOe 12-6^^roup profiles on two ability groups of nurses. Tbo solid line rep* 
resents the test performance of the superior nurses; the broken lino represents the 
inferior ouejs. (From Dvarah ) 

some of the teats. Two facts seem significant. (1) All but two 
of the tests seem to discriminate. (2) Where the tests do dis- 
criminate, the superior nurses are well above average. 

EXECUTIVE PEBSONNEL 

The Executive Type. — ^According to Cleeton and Mason * 
there is no general “executive type” of personality. They draw 

^ Dvorak^ BisATniCB. Differential occupational ability pattern. Univ, Minn, 
Butt. Bmplf/t. Stab, Bes, Inet,, Voh lllj No. 8, University of Minnesota Press, 

im, 

> Clbbton, Gusn V., AND Mason, Charugb W. Executive ability. Yellow 
Springs, Ohio: The Antioch Press, 1916, 





176 PRINCIPLES OF PERSONNEL TESTING 

their conclusion from the fact that Strong and others have not 
been able to demonstrate a clear-cut pattern of interests and 
attitudes of persons in executive capacities vs. those in non- 
executive capacities. One of Strong's most recent studies, ‘ 
though limited to public administrators, revealed extremely 
varied interests among the 517 men investigated. Cleeton and 
Mason do conclude that, in general, executives score relatively 
high on a wide vai’iety of ability tests, thus indicating that in- 
dividuals who successfully perform executive types of duties 
tend to be well-rounded personalities. 

Probable Value of Tests . — A survey of the literature yields 
little evidence of successful validity studies in the executive 
braclcets. This is no doubt due in part to the extreme difficulty 
attending the setting up of adequate criterion groups at the 
executive level, but it may also be partially attributable to a 
failure as yet to develop adequate measuring instniments. How- 
ever, at present, in addition to mental ability tests, those which 
seem to be most promising are the teraporament tests, the interest 
tests, and, to be specific, the Michigan Vocabulary Profile Teat. 

SUUUARY 

In the selection of tests for supervisory personnel mental 
ability tests generally “come through.” In supervisory, pro- 
fessional, and executive batteries, tests of the temperament and 
interest variety seem to be quite useful. It is probable that in- 
sufficient use has been made of vocabulary testa in executive and 
professional batteries. 

1 Strong, EDmiio K., Jr. DiiTerenoGa in intereats RiDong public nSminiatr^tpra. 
J. appl, PtycM,, 1047, 31, 18-38, 



CHAPTER XJII 


HOW TO CONSTRUCT A TEST 


The discussion thus far has dealt primarily with commercially 
available tests and their application in business and industry. 
However, individual plants, groups of plants, and trade associa- 
tions are finding more and more need for tests constructed to 
meet their own needs. In addition, in some instances tliere is a 
tendency for organized labor to agree with management to effect 
certain promotions and transfers on the basis of test results.' 
Particularly in the trade test area, the need for tailor-made trade 
tests for the purpose is apparent. In addition, there is a growing 
tendency toward the selective scoring of commercially available 
tests for a given purpose. Particularly is this true of tempera- 
ment tests in the selling area where it is less important to identify 
temperament components than it is to find those items and 
answers which actually discriminate between two criterion ^oups 
of salesmen. Whether it is a matter of test construction or the 
selection of significant items in a commercially available test, the 
problem of item validation is one and the same. The purpose of 
this chapter is to give some help in original item construction and 
to offer a simple but valuable method of item validation. 

FRBPAKING A TEST BUDGET 

Testing as Samplings — ^AU testing should be thought of as 
sampling. When one starts to test an individual’s knowledge of 
a particular trade field, it naturally is quite impossible to ask all 
of the potential questions covering all of the possible elements of 
information. Not only would this be impractical; it would be 
impossible. An analogy of a housewife making cake icing is use- 
ful. After she has the icing made, she adds a few drops of vanilla 
1 See Chap. IX, pp. 12M28. 

m 




178 PSmClPLES OP PERSONNEL TESTING 

extract, stirs it well, then tastes it to see if she has added enough 
vanilla. She does not have to eat all 0/ Me ii^ng to tell. She 
knows that one little taste gives a good indication of the whole 
batcli provided it is well stirred, and the taste is consequently a 
representative sample. The some principle holds in testing. A 
few questions are set up by which we “taste” the knowledge of 
the applicant's or employee’s knowledge. The estimate of his 
knowledge so gained is accurate insofar as the sample is repre- 
sentative. But again, like the cake icing analogy, if the sample 
is not representative, if the icing is not well mixed, spurious re- 
sults may be obtained. 

Content Outline. — ^In preparing a test over a particular trade 
content, then, the first step is to prepare an outline. Any trade 
can be broken down into a number of large areas or blocks. Each 
block, in turn, can bo subdivided into minor units or areas. The 
following outline whidi was llie basis for an auto-medianio's test 
is an example. 

OUTLINE OP AUTO UECHANIC’S TRADE 

A, Power Plant 

1. 7\ine-t<p 

2, Coolinis ayetem 

8. Fuel (iislribution 

4, Engine rcbnilding 

B. BSIcotricoi Syetem 

1. Basic informatioa 

2. Batteries 

3. Starting motor and generator 

C. Front End and Steering McohaniBin 

1, Steering gear 

2. Suspensioii eystem 

8. Applied geometry 

4. Service 

8. Bide control 

6. Bearings 

D, Chassis 

1. TransniiBsion 

2. Clutch 

8. Universal ioints 

4. Hear axlo 

8. Brakes 
Lubrication 



HOW TO CONSTRUCT A TEST 


m 

Jobs of lower than trade level can similarly be broken down. 
For example, one airframe plant broke riveting down into four 
major headings: drilling, riveting, subassembly, and skin rivet- 
ing. Each of these in turn was broken down into subparts. Out- 
lines such as these are essential if the sampling principle is to be 
followed. If an outline is not made, some areas are almost sure 
to be omitted and others are apt to be overemphasized. The 
importance of tradesmen and trade advisory committees in pre- 
paring such an outline cannot be stressed too much, Frequently 
union participation at this stage ensures union acceptance later. 

Further Analysis. — ^Although the content outline is essential, 
it alone does not ensure the preparation of tiie right questions 
or items. Referring to the above outline, the question may be 
raised “ What kinds of items can be prepared on tune-up ?" The 
following outline adopted from one developed for preparing trade 
analyses for deriving training content is useful: 

WHAT A WOfiKER NEEDS TO KNOW 

A, Work Procedures 

1. The “how to do question 

2. Tho “which should bo done first” question 

B, Technical Skills 

1. Reading drawings 

2. Making sketches 

3. Making drawings 

4. Laying out patterns 

6. Making calculations 

C* Job Information 

1. Trade terms 

a. Names of material and stock 

b. Operating terms 

e. Location names 

d. Machine and machine-part names 

2. Care of tools 

a. In use 

b. Not in use 

c. Prevention of loss 

3. Information about stock 

a. Recognition properties 

h. Working properties 

4* Job science 

a. Mechanics 

b. Hydraulics 



180 PRINCIPLES OP PERSONNEL TESTING 

c. Pnoumatics 

d. Elcclricity 

e. Heat 
/. Light 

g. Strcuglh of inateriala 
Chemistry 

fi. Safety mforniation 

a. PrevenUon of iiijuiy to self 
&. Pi^yeutioa of iojuiy to othem 
c. Prevention of injury to equipment 

Obvioudy, no single outline can be prepared that will serve 
equally well all job and occupational ai’eas. However, some 
breakdown of what the worker needs to know similar to the one 
above makes the preparation of questions easier. 

The Test Budget. — preparing the test the next step is mak- 
ing the test budget. This budget may take the form of a chart, 
similar to the one in Fig. 13-1, which actually combines the two 
outlines previously given. A chart of this character cnai then be 
completed by entering in each cell the proposed number of items. 
This phase can best be handled through some kind of a group 
or committee approach. Some companies are finding labor- 
management committees desirable. The one important point is 
to be certain that the committee includes representatives who 
are competent in the job area in question. It must be remem- 
bered that there is no magic substitute for judgment. The only 
way that values can be arrived at for inclusion in this outline is 
through judgment. 


CONSXRirCXIIfG IXEUS 

Types of Questions. — ^Up to this point the discussion has dealt 
solely with the subject matter to be tested. Now comes the ques- 
tion of what type of item to use. The unreliability of the so- 
called ‘‘essay question” is well known and will not be discussed 
here. Some form of objective, single-answer question is, of 
course, necessary. There are many such types, and they have 
been thoroughly discussed and illustrated by Remmers and 
Gage.^ This discussion will be confined to those types of items 

Rbhubrs, H. H., and Gaob, N. L. Udwsalwnal motnurement and evalwh 
lion. Now York: Harper, 1943. 



now TO CONSTRUCT A TEST 


181 


which have been most useful in industrial tests of the trade type. 
These are the true-false or yes-no item and the multiple-choice 


— 

WORK PROCEDURE | 

INFORMATION 

SAFETY INFORMATION | 

TOTAL 


! 

ITEM BUDGET 

FOR 

AUTO MECHANICS 

TEST 

TECHNICAL SKILLS 

TRADE TERMS 

CARE OF TOOLS 

STOCK INFORMATION 

JOB SCIENCE 


Tune-up 


9 

■ 


■ 




CCH 

uz 

So! 

Cooling system 


i 

■ 


■ 




Fuel distribution 


9 



■ 




Engine rebuilding 





■ 




it 

WQ 

iu£ 

1- 

Bosic tnfdrmotion 





■ 




Botteries 





■ 




Starting motor ond generotor 





n 




FRONT END 

AND STEERING 

Steering gear 





■ 




Suspension system 





■ 




Applied geometry 





■ 




Service 





■ 




Ride control 





■ 




Bearings 









CHASSIS 

Transmission 









Clutch 









Universal Joint 









Rear axle 









Brakes 






r~ 



Lubrication 


[ 






L_ 


Fra. iS-l^Blank form for uso in preparing test item budget. 


item. Others can and have been used, but these two are generally 
speaking most useful, and discussion will be limited to them. 
The true-false type is illustrated by the following: 


True False 

1, A atonige battery has separators between the plates — O □ 





















Ig2 PRINCIPLES OF PERSONNEL TESTING 

The testee is asked to indicate whether the statement is true or 
false by making an X in one of the squares. An alternative is to 
ask a question that can be answered with yes or no, The major 
advantages of this type of item are its simplicity, its ease of con- 
struction, and the fact that more of this type can be answered in 
a given time period than is the case with any other type. 

The multiple-choice or multiple-response item is illustrated by 
the following; 

1, In the formula, 4(i=7d+l6, { ) 

(1) +6, (2) “6, (3) +11, (4) -11 

The testee merely enters 1, 2, 3, or 4 in the blank to indicate the 
answer that he believes is correct. Sometimes four alternatives 
are used; sometimes three; and sometimes five. All facts con- 
sidered, four seems to be the optimum number. Although it is 
more difficult to construct than the true-false item, it is the best 
all-around type of item because it is least subject to guessing and 
consequently is more stable. Whereas with a true-false item the 
chances of guessing are fifty in ahundrod, with the four alternate 
multiple choice item the ciianccs are reduced to twenty-five in a 
hundred. As indicated before, there are instances where other 
types of items are as good or oven bettei', but gcmerally speaking 
it is the most useful type of item for personnel tests. 

Suggestions for Preparing Items. — ^Numerous books on 
psychological measurement are available, many of which provide 
usable suggestions. Notable among these is the one by Eemmers 
and Gage ^ already cited. A few of the more significant points 
with respect to multiple-choice item construction seem to be in 
order, however. 

1. Prepare some hard, some easy, and some moderately difficult 
items. 

2. Avoid trick or catch items. 

3. Avoid obviously wrong alternatives. 

4. Make each item os short as possible. 

6, Make all alternatives or responses about the same lengtih. 

6. Place correct answers in random order. 

7. Avoid “cue” words in the root of the item. 



HOW TO CONSTRUCT A TEST 183 

Number of Items. — the event Ihat time and conditions per- 
mit, more items sliould be prepared than are desired in the test 
after preliminary tryout. Two to one is a good ratio. Thus, if 
a finished test of 100 items is desired, the experimental form 
should include 200, and the item budget similar to the one in Fig. 
13-1 should provide for that number. This will permit the 
elimination of the less valuable and the bad items in accordance 
with the item-analysis procedure described below. 

iTEtf-ANALYSIS PSOCSUVRSS 

Item Difficulty. — ^Item-analysis techniques generally have two 
purposes: to determine the difficulty of the item or to determine 
the validity of the item. Item difficulty is me^ured in terms of 
the proportion, of the population tested that gets the item right. 
Thus, if one item is passed by 03 per cent of the population while 
(mother is passed by 78 per cent of the population, the former is 
easier than the latter because tlie probability that any single per- 
son will get it right is greater. Thus, by simply- counting the 
number of persons in an experimental population who get each 
item correct and converting these raw numbers to percentages, 
all items can be ranked in the order of difficulty. It is generally 
considered good test procedure, when a test is revised, to place 
the easiest item first, the hardest one last, and the intermediate 
ones in their corresponding positions according to percentage 
passing. 

Item Validation. — ^Difficulty alone, however, does not deter- 
mine the usefulness of a particular item. It is necessary to know 
whether or not the item is valid. Does it measure what the test 
as a whole measures, or does it measure what the test is supposed 
to measure? To evaluate the extent to which test items meet 
one or both of these criteria, several methods have been devised. 
An extensive discussion of these methods is beyond the scope of 
this chapter. However, one method of item validation that has 
been sliown to be satisfactory will be discussed. 

The Principle of Item Validation. — In evaluating the validity 
of test questions, a criterion is necessary. In other words, it is 
necessary to identify two groups of people, known or assumed to 
differ on the trait, attribute, or knowledge being measured, and 



PRINCIPLES OF PERSONNEL TESTING 


184 

to compare the performance of the two groups on the item in 
question. If the good or superior group gets the item correct with 
greater frequency than does the poor or inferior group, the item 
is assumed to be good. However, if equ^ or nearly equid pro- 
portions of tlie two groups got the item correct or if, perchance, 
the poor group does better, it is assumed that tho item is invalid 
for that purpose. Two methods are generally employed for se- 
lecting the criterion groups. These are known as the criterion of 
internal consistency and the use of an external ci'iterion. 

Use of External Criterion. — ^Wheu an extcnial criterion is 
used, some means, in no way related to the test, is found for 
identifying two groups that are known to be different and the 
performance of each gi'oup on each item in the test is determined. 
For purposes of illustration, the performance of naval trainees 
on the Practical Electrical Information Teat discussed in Chap. 
IV will be used. An experimental form of the test, consisting of 
116 items, was administered to 237 trainees at the start of the 
training program. Four sample tnic-falsc items with their 
original numbers follow: 

150« A retum-oall pu&h-button eystom Ihvoo wirca* 

loo. In rigid iron conduit^ one sido of tho Imo ia grounded, 

196, A household clcolric motor records tho voltage used. 

228, When an nutomobilo horn is usccl, losa current gooa through tho button 
circuit than through tlie horn. 

Items 156, 195, and 228 will be discussed first. Now these three 
items were equal or nearly equal in difficulty, since 59 per cent 
of the total group got number 156 correct and 53 per cent of the 
group got numbers 195 and 228 correct. However, as determined 
later, they differ considm'ably in validity. After the fifteen 
weeks’ training period had passed, and aitei* eadi of tiie 237 men 
had mode a record in the school, the group was divided into the 
25 per cent making the highest grades in the school, the 25 per 
cent making the lowest grades, and the middle 50 per cent. A 
study was then mode of the proportion of each grade group who 
got each item correct. F^ure 13-2 shou^ the results. Note that 
on Item 105, 36 per cent of the low group, 5d per cent of tho 
middle group, and 77 per cent of the high group answered the 



HOW TO CONSTRUCT A TEST 


186 

item correctly. This is then a valid item. Note also tliat a simi- 
lar relationsliip exists on Item 156 but that the spread is uot ao 
great. This is still a valid item, but it is not so good as 195. 
Note, then, the results on number 228. The percentages of 53, 
54, and 51 indicate that the performance of the three groups is 
essentially identical. In other words, a member of the superior 



Botlom Middio Top 

fourth half fourth 




ltem#l95 


Item# 156 


item^22a 


Fig, 13*2^ — ^Peroentago of traineeg recaiving courso grades in the bottom fourth, 
middle half, and top fourth of Uio class who passed each of three questions in an 
experimental form of tlie test* 


group is no more likely to pass the item than is a member of 
either of the other groups. This item is said to be invalid and is 
discarded. 

Figure 13-3, based upon tlie results obtained for Item 160, 
illustrates the type of reversals that may occur. In this instance 
the poor group actually surpassed the good group, since 73 per 
cent of the former and 61 per cent of the latter got the item 
right. Obviously, items of this sort operate at cross purposes 
with the rest of the test and must be eliminated for best re- 
sults. 

Discrimination Values. — While the items used above for illus- 
trative purposes could be accepted or rejected by inspection, it 
has been found helpful to use an index to describe the validity of 



186 PRINCIPLES OF PERSONNEL TESTING 

test items. Figure 13-4 is a nomograph ^ designed to assign such 
an index known as the discrimination value, or i)-value. While 
Figs. 13-2 and 13-3 show the performance of three groups, higli, 
low, and middle, for purposes of determining a Z)-value only the 
two extreme groups are necessary. The nomograph is used as 
follows: (1) Look up the percentage of the high group passing 



Hem Na160 

Em, 18'3;— Percentago of trainees receiving courso grades in the bottom fourth, 
middle half, top fourth of tlie class who passed ii given question in an experi- 
mental form of the test. 

on the scale at the left and the corresponding value for the other 
group on the scale at the right; (2) connect the two points with 
a straightedge; (3) read the value on the middle scale where the 
straightedge intersects the scale line. Thus, the il-values for 
the items used earlier are: 195, 1.1; 156, 0.6/ 228, 0; 160,-0.3. 
Note that in the case of Item 160 the JD-value is negative; negar 
tive signs always appear when there is a reversal, that is, when 
ihe poor group docs better than the good group. In this fashion 
every item may be assigned a JD-value that is indicative of its 
validity. As will be shown later, it is then a simple matter to 
select the most valid items for inclusion in the revised form of 
the test. 

^ Lawsuh, G. H., Jn. A nomograph for estimating tlie validity of test items. 
J. appL Psyoftot, 1943, 20, 846-S49. 



HOW TO COmTBVCT A TE&T 187 



FxiQ« 13-4^A nomograph for aBsigniog Z)-vaIUQB (discrimination values) to 
test queatioDB. {From Lamhe,) 



188 PRINCIPLES OF PERSONNEL TESTING 

Criterion of Internal Consbtency. — ^Evcn though a trial or 
experimental form of a test is almost certain to include some poor 
items, the best assumption is still that those getting the highest 
scores are the best and tliose getting the lowest scores ore the 
poorest. For this reason extreme scoring groups are frequently 
taken as criterion groups. For example, if 200 people take a 
test, even though there are certainly some bad items, there is 
still little question that the 50 scoring highest arc superior in 
whatever the test measures than arc the 50 scoring lowest. Con> 
sequently, it is sometimes customary to set up tlic high 25 per 
cent and the low 25 per cent, the high 30 per cent and tlie low 30 
per cent, or other extreme proportions as criterion groups with 
reasonable assurance that the two differ in the quality or knowl- 
edge being measured. The problem then becomes one of com- 
paring the performance of these two groups on each iteni of the 
test. 

For purposes of illustration, two items from an experimental 
form of an interest inventory for sales personnel are presented. 
This inventory was administered to 300 candidates for sales jobs 
of varying types. The items were scored on the basis of the best 
logical analysis that could be made, and the lOO persons making 
the highest scores and the lOO making the lowest scores were 
identified and designated as criterion groups. One question was 
as follows: “Do you like to think a long time before expressing 
your opinions?” Provision was made to answer “yes,” “no,” or 
“maybe.” The “no” answer was given by 67 per cent of the high 
group and by only 22 per cent of the low group. This item, as 
reference to Fig. 13-4 will show, has a D-voluo of 1.2 and is 
highly valid. On the other hand, to another question tliat read 
“Do you rather enjoy spending an evening alone?” 42 per cent 
of each group answered “No.” The D-value is obviously zero, 
and the item in no way discriminates between the two groups. 

USING item-analysis RESULTS 

Revising the Test.— -Botli of the procedures outlined above are 
useful in identifying the most valid items. As pointed out be- 
fore, the methods differ only in the means whereby extreme 



HOW TO CONSTRUCT A TEST 189 

groups of people are identified. Once i)<yalues have been ob- 
tained, a distribution similar to the one in Table VII should be 
prepared. This is a frequency tabulation of the actual D-values 
obtained in the administration of the experimental form of the 

TABLE VII.— FHEQUENCY DISTRIBimON OP JXVALUES OP 
ITEMS IN EXPERIMENTAL TEST 


/>-valnQ 

Tabulation 

f 

11 

/ 

1 

].0 

/ 

1 


/ 

1 


im// 

7 

0.7 

rm /// 

8 

0.6 

rwrm/// 

13 


7W m/ /// 

13 

0.4 

rmm if 

12 

0.8 

m m fH4.fi 

17 

0.2 

TfH HU 

0 

0,1 

IfHtfH If 

13 


TtH-m flH 

14 


fill 

4 

-0 2 

f 

1 


If 

2 

-0.4 


0 

-0.6 


0 

-0.6 


0 

-0.7 


0 

-0.8 

1 

_1 

Tolnl... .116 


Practical Electrical Information Test described above. Notice 
that the D-values range from 1.1 to —0.8. The next problem of 
the test maker is to decide how many items to retain. This is a 
question for which there is no generally applicable answer. Gen- 
erally speaking, a long test is better than a short test but, of 
course, not if the long test is simply the short one with poor items 
added. The problem then becomes one of starting with the high- 
est jD-value item and moving down until enough items have been 
included to moke a teat of reasonable size. In the example at 
hand, a cutoff at 0.4 was used. There were fifty-six items with 







190 PRINCIPLES OP PERSONNEL TESTING 

JD-values of 0.4 or better. One of these was diseased for a 
reason not associated with the present discussion, leaving fifty- 
five to be included in the revised test. As indicated in Chap. IV, 
this short test proved quite useful in identifying potentially suc- 
cessful naval electrical trainees. Arbitrary cutoffs are hazardous 
because of the differences in pairs of criterion groups. However, 
experience has indicated that in the average situation, Z)-vaIuea 


TABLE VIIL— DISTRIBUTION OF TIME REQUIRED BY 237 
PERSONS TO COMPLETE A TEST 


Mm. 

TabulaUon 

B 

B 

Por- 

oen- 

iile 

15 

/ 

1 


100 

14 


0 


00 

13 

w/ m //// 

14 


00 

12 

Tw mmmm / 


222 

04 

11 

rmrm TfUfHi /{/ 

23 

106 

83 

10 

mrmTfHm TmmimtM 


173 

73 

0 

tmrmm f/ 

47 

133 

66 

8 

rm m TM^-m rm m mm rm 

45 

80 

so 

7 

mmmmm 

25 

46 

17 

6 

mm/ 

11 

16 

7 

6 

//// 

4 

6 

2 

4 

/ 

1 

1 

1^9 


less than 0.3 or 0.4 seldom contribute to a more reliable or valid 
test. The problem still remains one of balancing maximum-size 
2>-vaIucs with maximum test length. 

Establishing Time Limits for Tests. — ^As a general principle, 
mental ability tests are the only ones in which speed is a factor. 
Consequently, the reader will rarely if ever need to consider 
speed in score determination. However, even on a straight-for- 
ward trade information test, some time limit is necessary. It is 
customary to set such time limits so that there is little emphasis on 
speed and so that nearly everyone can finish. This can be achieved 
by administering the test on a tiial basis with no time limit and 
asking each person taking the test to record the time when he 
finishes. These time values together with the storting time can. 














ROW TO CONSTRUCT A TEST 


191 

be used to tabulate tiie time required for the total group. Table 
YIII is a sample distribution. Note that the time required 
ranged from four to fifteen minutes. By means of the tabula- 
tions and the values in the accompanying columns, a reasonable 
time can be cstabliriied. Since only 73 per cent have finislied in 
ten minutes, that is obviously not enough time. In this particu- 
lar test thu'teen minutes was establislied. Note that 99 per cent 
(aU but one of 237) have finished. To extend the time two more 
minutes for one person seems unreasonable. When knowledge 
or information types of tests are being used, os indicated earlier, 
items are usually airanged in order of difficulty with the most 
difficult items at the end. Since there is some relationship in 
this kind of test between rate and accuracy, the probability that 
these very slow people will get the last items correct is small. 
Therefqre as a general principle no damage is done by setting 
the time limit so that even as many as the slowest 5 per cent can- 
not finish. 



OHAPTBR XIV 


INAUGURATING AND OPERATING A 
TESTING PROGRAM 


The preceding thirteen chapters in this book have dealt with 
the technology of personnel testing. Every personnel man and 
every other person who has held a responsible job in business or 
industry know that good technology is frequently scuttled be- 
cause of improper methods of initiation and operation. The pur- 
pose of this final chapter is to crystallize a workable point of 
view for test program administration. 

First Considerations. — No company should undertake the use 
oj psycholofficd tests mthout the assistance of a person specif-- 
ically trained in the use of such tests. Generally speaking, the 
management has three alternatives: (1) to add a well-rounded, 
fully qualified personnel psychologist to the staff; (2) to secure 
the assistance of a qualified consultant * who is on the staff of a 
university or a responsible consulting firm; or (3) to select a 
member of the present personnel organization who has some in- 
terest or background in the field and send him to a recognized 
university for training. Frequently a combination of the last 
two is the best solution if the company feels that it is not large 
enough to justify the addition of the services of a full-time psy- 
chologist. 

Too often companies have “dabbled” with tests when no 
individual in the organization was qualified beyond being “in- 
terested.” Instances of this character more often than not lead 
to imuble and frequently result in tests being “kicked out.” A 

^ While there nre many qualified persons doing oonsulting urork in the testing 
field, there are also many so-callc<l ‘'cotKultanU" who have noithor the training 
nor experience to do the job. Ropiiteblo psychologists are usually membora of 
the Amorcion P^cliologioal Association, 1616 Moasaohusotts Ave., N.W,, Wa*- 
ington 6, D,C. 


102 




INAUGURATING AND OPERATING A TESTING PROGRAM 193 

testing program built on solid ground and guided by a com- 
petently trained individual never experiences this misfortune 
and usually grows in importance and respect. 

THE BASIC PSOCEDURB 

At the risk of oversimplifying the problem, a basic procedure 
consisting of seven fundamental steps will be outlined here. 
They are 

Step 1. Establish the personnel-testing policy 

Step 2. Introduce the program in the plant 

StepZ. Identify the jobs or departments having personnel 
problems 

# 

Step 4. Obtain job descriptions for these jobs 

Step 6. Select tests for tiyout 

Step 6. Select criterion groups and administer tests 

Step 7. Establish the operating test program 
Each of these will be discussed individually. 

Step 1. Establish the Pensonnd^testmg Policy 

Management Support Necessary. — ^Few if any testing pro- 
grams succeed witliout the support of top management, and few 
that have the support of top management fail providing tech- 
nical competence is available. Authority, however, should not 
be confused with support. Many managements have authorized 
programs of one kind or another without whole-hearted support. 
The result is frequently failure. Top management must genu- 
inely believe that personnel tests ore tools that are essential in 
the administration of an onlightenei! personnel policy. 

Getting Management Support. — How can management be 
convinced that tests are deserving of managerial support? In 



104 PRINCIPLES OF PERSONNEL TESTING 

general, this support can be obtained in one of two ways: through 
the experience of other companies or through pilot studies in the 
plant in question. Frequently this latter approach is extremely 
effective. If one department or one operation where a severe 
personnel bottleneck exists can be selected for a tryout and real 
results can be shown, almost any management will sit up and 
take notice. Evidence of this kind supported by forecasts of 
future employment and personnel problems confronting the com- 
pany in which tosts might help is useful. 

Budget, Location, and Policy. — In any far-reaching program 
a budget is, of course, necessary. The failure of management to 
establish a budget, however modest, is evidence of lack of the 
support mentioned earlier. Where the testing program should 
be located and what department or administrative officer sliould 
be responsible are questions that cannot be answered without a 
knowledge of the variables involvctl. Obviously the 'function 
should be located somewhere in tlie industrial-relations depart- 
ment or division. Whether it is attached to personnel, training, 
or employment or is a separate unit must be decided in terms of 
company size, qualifications of personnel in these departments, 
and the particular emphasis that the company wishes to place 
on testing. Wherever the placement, the beginning should be 
modest. Physical facilities and personnel should be geared 
to present needs and sliould not be overambitious; when testing 
produces results, growth will come naturally. 

Written Policy Desirable. — Policies are always better when 
they are written, and personnel-testing policies are no exception. 
Once management has decided to support a testing program, the 
purpose of the testing program, what test data are to be used 
for, and the responsibilities of those in charge should be made 
a matter of record. Below is a sample policy for a company 
sufficiently large to support a separate testing unit known in this 
company as a division. 

Qrmorad Porjcrr or DiviBicir or Pkrsonni!Ii Tbstino 

The purpose of this policy is to define the function of the Divi- 
sion of Personnel Testing in regard to its furnishing information 



INAVGVRATING AND OPERATING A TESTING PROGRAM 196 

and recommendations on the selection and placement of person- 
nel and the establishment of personnel standards for given occu- 
pations and jobs. 

I 

The Division of Personnel Testing is authorized to and ip 
responsible for the securing of objective information from appli- 
cants for employment and from general employees as directed by 
the personnd manager. The function of tiie division is one of 
gathering information with the use of objective and scientific 
methods and reporting on it. Its function is not one of making 
decisions. 

II 

As a consulting service, ihe Division of Personnel Testing may 
assist any authorized department or supervisor in securing and 
analyzing information on any applicant, employee, or group of 
employees for the purpose of determining qualifications for em- 
ployment, transfer, and assignment and for setting job standards 
for personnel. Such information will bo so presented that on 
objective and complete appraisal can be arrived at without preju- 
dice or harm to the character or reputation of the applicant or 
the employee. 

m 

The personnel manager sliall be responsible for the establish- 
ment and maintenance of a committee of responsible representa- 
tives from operating supervision and the personnel department. 
This committee on pei'sonnel toting shall be responsible for the 
establishment of a program to meet the needs of ^e company for 
this service and shall be responsible for the program’s being car- 
ried out efficiently and effectively. 

IV 

The Division of Personnel Testing will keep accurate and 
dated records of all its testing and recommendations. The divi- 
sion shall keep all of its records on persons tested strictly confi- 
dential, available only to authorized supervision and manage- 



196 PRINCIPLES OP PERSONNEL TESTING 

ment. It shall use only methods and matorials acceptable to its 
profession and shall conduct its affairs and make its reports on a 
strictly objective basis. It sliall not discriminate against or be 
prejudiced by any person, applicant or employee, because of race, 
creed, color, country of origin, or union affiliation. 

V 

Any supervisor or other person in authority shall fed free to 
discuss, in line with his work and his own subordinates, the test 
results and their meaning with any of his people. The function 
of the division is to make available such information that it might 
have or be able to secure and to inteipret it upon proper request. 
The division will not give out abstract test results as such, nor 
will it make any decision with respect to any individual tested. 
It is, in every instance, the responsibility of the supervisor or the 
committee on personnel testing to make decisions concerning the 
hiring, placement, or movement of employees. 

VI 

Any employee of the company should feci free to discuss his 
own test results with the supervisor of personud testing if he 
has a problem that he feels will be benefited by such a confer- 
ence. No test results will be given to the employeo; thdr mean- 
ing and significance will be discussed only according to acceptable 
counseling procedure. No test results will be available to any 
outside agency without the written consent of the employee con- 
cerned. Test results of applicants are not available to the 
individual. No test material may leave the division or its juris- 
diction. 

vn 

No provisions of this policy on personnel testing shall be 
changed or added to without the written approval of the plant 
manager after such recommendation for change has been made 
in writing by the committee on personnel testing. A budget 
shall be set up within the personnel department for the Division 
of Personnel Testing, and the division will stay within its bud- 
get. 



tNAVGVRATINO AND OPERATING A TESTING PROGRAM 197 

The exact nature of the policy will vary from plant to plant; 
and although the example above would not be satisfactory in all 
situations, it nevertheless illusti'ates the point. Such a policy 
should be clear, direct, and brief. 

Step 2. Introduce the Program in the Plant 

Function of Tests. — ^The reader should continually keep in 
mind the fact that tests are only a tool in the employment situa- 
tion. The facts that they supply are only additional information 
which, together with application and interview data, contributes 
to better employee selection and placement. It is important that 
test data (and data from other sources) should not be used to 
force a particular employee down the throat of the supervisor. 
The most successful approach is to (1) screen out obviously un- 
fit applicants, (2) identify those most likely to succeed on a par- 
ticular job by means of test and other data, and (3) refer the 
candidate or candidates, together with data, to the supervisor in 
question. The supervisor should make the final decision. In the 
last analysis the supervisor must live every day with the choice, 
and he alone is responsible for success or failure in his depart- 
ment. To take away from him his right to choose those who are 
to report to him is to deprive him of one of his important super- 
visory functions. Tests are only aids and should not in any way 
he used to shear the supervisor of any of his authority. 

How to Get Supervisory Support. — ^In general, the same tech- 
niques are effective in getting the support of supervisors as are 
useful in getting the support of top management. Top manage- 
ment authority is no substitute for supervisory support. Super- 
visors in any plant in which testing is being inaugurated will 
vary in their attitudes all the way from violent rejection to en- 
thusiastic acceptance. One of the first jobs of the testing-pro- 
gram director is to tone down the enthusiasm of the latter and 
“sell” the former on the value of tests. This can best be done 
through pilot programs. That is, a situation should be selected 
where there is a real personnel problem and in which the par- 
ticular supervisor is willing to “go along.” Once results have 
been obtained, they should be presented to other supervisors by 



198 PJilNOIPLES OF PERSONNEL TESTINQ 

means of the kinds of graph and cliart used in this book. 
Slowly the program will grow, and eventually one supervisor 
after another will request testing services. There is little or no 
merit in swinging an entire plant into a testing program all at 
once. The success of a testing program is not measured in teirnis 
of such statements as “We tested so many people last month.’* 
Success must bo evaluated in terms of the number of job classi* 
iications for which batteries have been validated plus the genu- 
ineness with which the program is accepted by supervisors and 
others. 

How Not to Evaluate Tests. — ^Althougli there is usually no 
harm in permitting supervisors to look over or even take tests, 
the person in charge of testing should not get himself in the 
position of letting supervisors decide on the merits of a test in 
terms of how it looks. Supervisors should be trained to accept 
or reject tests on the basis of facts similar to those presented in 
this book and not in terms of whether or not he likes a particular 
test or a particular question in that test. 

Arguments with those who object to testing never accomplish 
anything of a constructive nature. If a particular supervisor 
cannot be won over on tlie basis of personal relationships plus 
facts, he certainly cannot be won by argument. Statements 
such as “Skeptics are the best friends of testing” and “If every- 
one was as skeptical as you, there would be fewer ‘quacks’ in the 
testing field” are often enti'ees to a satisfactory relationslup. 

Preemployment Testing and Organized Labor. — ^Preemploy- 
ment testing is completely outside the jurisdiction of the union 
except in instances in whidi the closed shop functions and the 
union actually supplies employees to the company. When an 
open or union shop exists or when such union security devices as 
maintenance of membership are operating, management may 
select anyone whom it chooses and it may ask any questions that 
it wishes orally in the interview or in writing in the form of 
application blanks or tests. 

Union Support Needed.— -However, union support is necessary 
if the present employee method of test validation described in 
Chap. II is to be used. That is, if the company wishes to pull off 



INAVaURATINO AND OPERATING A TESTING PROGRAM 199 

tile floor a number of machine operators, presently on the job, for 
the purpose of setting test standards to be used in new hiring, 
union support is necessary. 

Upgrading. — ^The use of tests in the upgrading or transferring 
of present employees is a matter of concern to union officials. 
It is a matter that can diplomatically be worked out as a general 
rule provided there is union representation in the early planning. 
However, if a plan is devised and union officials are asked to 
accept it, differences usually aiise and sometimes testing is com- 
pletely blocked. 

Some unions, although not many, are on record as being 
opposed to tests as well as to other devices of scientific manage- 
ment. The vast majority, however, are open-minded. Although 
efflentially all unions have gone "all out" for seniority as the 
criterion for promotion, a significant fraction of all contracts arc 
written with a clause sucli as "where ability and physical fitness 
are relatively equal, seniority shall be the determining factor." 
Management at times has been weak in its ability to demonstrate 
the superiority of an employee. There are extremely promising 
signs throughout the country that many unions are becoming 
more willing to measure certain kinds of ability by means of 
tests.^ 

Most Unions Can Be Sold.— Where management is sincere, 
where there is a history of genuine collective bargaining, and 
where there is a reputation of managerial fair play, organized 
labor nine times out of ten will go along with a well-grounded 
testing program. If there has been excessive friction, tests will 
be dubiously received if at all. If management has a reputation 
(based upon real or imaginary happenings) of shady practices, 
testing will be looked upon with suspicion. 

Step 3. Identify the Jobs or Departments Having Personnd 

Problems 

Criteria for Choosing. — Tests ore of no value if they do not 
help management to do a better job of managing. Unless a 
managerial problem can be solved or can be handled better as the 

1 Bee Chap. IX, pp. 127-128. 



200 PRINCIPLES OP PERSONNEL TESTING 

result of a testing program, pragmatically there is absolutely no 
point to testing. This statement, of course, assumes manage- 
ment's ability to identify problems. The testing program sliould 
be inaugurated in a modest way. One or two jobs or depart- 
ments in which there are known problems sliould be identified. 
These should be selected because they ore known to be high in 
turnover, high in training costs, high in absenteeism, high in 
wastage, or low in production or /or some otfier tangible reason. 
Vague generalizations to the effect that ''morale is low on that 
job” ore of little help. Unless the problem is tangible, the ap- 
proach to its solution tends to evade objectivity. Eeasons for 
selecting a few problem spots of this sort are evident: (1) They 
are known to be problem spots throughout the plant generally, 
and (2) any improvement that is brought about is readily ap- 
parent. 

Step 4. Obtain Job Descriptions /or These Jobs 

Job Analysis. — Job study after the fashion suggested in Chap. 
II is necessary at this point. It is highly important to take ad- 
vantage of any prior job study that may already have been con- 
ducted in the organization. It is well to contact those in charge 
of methods and procedures, motion and time study, job evalua- 
tion, and safety because in many instances valuable job analyses 
have already been conducted. Earely ore these completely ade- 
quate, but quite often they provide a firm basis upon which to 
build, and it is not uncommon to find that the key factors in a 
job have already been identified. 

Step 5. Select Tests /or Tryout 

Also as outlined in Chap. II, at this point several tests should 
be selected for tryout. The number and character of tests should 
be determined by the job itself and the relative importance of 
improved selection in ^c job. 

Step 6. Select Criterion Groups and Administer Tests 

Procedure. — The whole problem of selecting criterion groups 
has been extensively discussed in Chap. II, and little can be 



INAUaVRATINa AND OPERATINO A TESTING PROGRAM 201 

added here. Extreme care should be taken in the case of the 
present employee method to make certain that everyone, the 
supervisor, the union official, and the individual employee, knows 
why the employees are being tested. It is well to explain again 
that "we are testing the tests** and that how the employee does 
will in no way affect his job. 

Step 7. Eatdblish the Operating Test Program 

Evaluating Tests. — At this point, those tests which show an 
actual relationslup with one or more criteria should be identified. 
Too much emphasis cannot be placed on the necessity of using 
one of those teclmiques (or a similar one) outlined in Appendix 
A for determining the extent to which the results obtained might 
be attributable to chance alone. The possibility of tests "back- 
firing'' can bo virtually eliminated if these checks ore always 
applied. 

Critical Scores. — In some instances it will be desirable to es- 
tablish critical scores, tlmt is, cutoff points designating those 
portions of the score range which are acceptable and those 
which are unacceptable. More often, however, because of fluc- 
tuating personnel demands and fluctuating labor markets, a 
more flexible scheme will be desirable such os A, acceptable; 
B, acceptable only under certain circumstances; and C, unac- 
ceptable. 

Peiaonnel Records* — ^The necessity of adequate personnel 
records is apparent. Those in charge of personnel research 
should set up some sort of summary sheet to be filed with the 
personnel record. In addition to test data, such a sheet should 
include transfer, promotion, rate change, and termination data. 
Absentee facts, a record of accidents, merit-rating results, and 
any other data likely to be useful for criterion purposes should 
be included. Such a sheet, filed with the peisonnel record, is 
conveniently located and yet may be withdrawn for research 
purposes. 

Management Reports. — ^Many otherwise good testing pro- 
grams tend to stagnate or become lost because top management 
is not kept informed. It should be the policy of those in charge 



202 PSmaiPLES OF PEnSONNEL TESTINQ 

of testing to provide periodic progress reports for this purpose. 
A few simple rules should be followed: 

1. The report sliould be brief, no more than two or three pages 
when possible. 

2. It should stress true accomplishment rather than effort. A 
statement of what was accomplished with respect to one 
job is more impressive than any statement of how many 
were randomly tested. 

3. Graphic presentations similar to those used throughout this 
book and described in Chap. IV should be used. 

4. Involved statistics (such os probable errors, coefficients of 
correlation, etc.) should usually be avoided in the report. 
Those statistical techniques which must be used should be 
kept behind the scenes except in unusual cases. 

SUMMARY 

A few “don’ts” adapted from Ruch^ are in order by way of 
summary: 

1. Don’t expect one test to solve all your problems. 

2. Don’t xely exclusively upon tho advice of professora. (This 
book was written by a professor.) 

3. Don’t expect to carry over a complete program from another 
concern. 

4. Don’t regard testa solely as a means of rejecting employees. 

6. Don't expect perfection in a testing program. 

6. Don’t let the name of a test mislead you. 

7. Don’t lump subtest scores into totals; analyze them sepa- 
rately. 

8. Don’t start a testing program until you have a capable 
trained person to handle it. 

1 Rucii, Fu>td How lo vac emptoyment teata. Bull. No. 1, Lob AngeleB: 
ColifomlB Teat Duraau, 1044. 



APPENDIX 




APPENDIX A 

SAMPLING THEORY AND PRACTICE 


Throughout the text proper the baeio point of view has emphasiied the neoee* 
sifcy of comparing the teat and job performance for two or more groups of em- 
ployees. Sometimes the tedmique has involved the comparison of the mean or 
average teat scores of one group that has turned out well on the job with the 
average test scores of another group that has not turned out well. Occasionally^ 
the reverse procedure has been uacdi in which the average production of a group 
of high scorers was compared with the average production of a group of low scorers, 
Jn other instancoa compariaoDa between two percentages have been made. Dur- 
ing these discussions no comment has been made regarding how large a difference 
must bo «found between tlic two groups before it can bo taken at face value or be 
considered significant. The purpose of this appendix section is to provide the new 
worker in the field of personnel testing wiUi procedures for evaluating such dif- 
ferences. ToclmicnJ discussions of sampling theory aro not included. 

Need for Evaluating Differences.— Suppose that twenty people who take 
a oortain proemployment test are placed on the same job. The group of twenty 
is divided into two, so that the ten who mode the highest scores are placed in one 
group and the ton who made the lowest scores are placed in another. Performance 
records for members of the two groups are examined at the end of three months 
or so, and nvorago production figures aie computed. Suppose, for purposes of 
illustration, that tlieso are assemblers of some kind and that the average number 
of units por hour assembled by the high scorers is 17.0 whereas the corresponding 
figure for tho low scorers is 16.4. The difference obtained between the two is 2.2 
units. The following questions now arise: Is this diffcrenco Biiffioiently large to 
justify the use of tills particular teat for the future selection of employees? 
Docs tho test really identify belter asaemblers, or was it just a matter of chance 
that tho high scorers did better on tho job than Uie low scorers? If this differ- 
ence of 2 2 is considered significant, would 1.8 likewise by significant? If so, how 
small could tho difference bo and yet leave little doubt that the use of tho test 
is justified? These questions make it evident tliat some technique or method 
must bo employed to evaluate differences between means or averages and between 

percentages. - , 

Differences Can Occur through Chance.— Any player of bridge, poker, or 
other card game knows that differonooa can come about through chance alone. 
Following a completely random shufilo, a bridge player frequently discovora that 
he has more cards of one suit than of the others in his hand. Although the ex- 
pected number is three or four, lie quite often finds five, six, sevon, or even eight 
of one suit in his hands. Only rarely does ho find more, but there are uutanoes 
of the perfect bridge (all one suit) on record. If an individual were not aware of 

206 




206 PRINCIPLES OP PERSONNEL TESTING 

the mnko^up of a standard deck of playing cards, and if ho attompted to guess 
the make-up of the deck after having been dealt hia first hand consisting of two 
olubs, six diamonds^ three hearts, and two spades, lio niiglit make n serious 
error. If ho were to guess that tho entire dock contains threo times ns many 
diamonds as chibs, he would, of course, be wrong. Even when ihn odds aro in 
favor of getting idontical numbers of each suit, differences do occur through 
chance alono. llie same is true in testing. Sometimes when two groups of em- 
ployees make dilTcront avcnigo scores or come up with diftorent average produc- 
tion records, we find that tho diiTcrenccs have occurred through olianco and that 
the groups arc really equal. 

BIFFERENCBS IN MEANS 

The Significance of Differences in Means.— Statisticians have developed 
ways of placing an evaluation on a given difTcrence. They have provided ^vays of 
arriving at the amount of confidence that can bo placed in a particular dilfcrenco 
that is found. It la beyond the scope ^ of this book to discuss the statistical 
theory and mnthcmAtical derivation of formulas. However, below is a simple 
step-by-step proccduro that can be applied by almost anyone. 

PROCEDURE FOR EVALUATING DIFFERENCES 
BETWEEN MEANS OF TEST SCORES 

Step l^Assiiming that employees have been divided into grnuiis on tho basis 
of job performance, consider one group good \g) and the other 
poor (p). 

steps.— Count tho number of good employees 

Step S^Compute the moan (Mg) test score of tho good group. (Sum tho 
scores, and divide by 

Step 4.— Compute the variability (V^ of the scores of the good group. 

a. Square each score. 

b. Add the squares, 

c. Divide this sum by Ng 

d. From tliis quantify subtract tho squnre of tho mean (ilfp). 

Step5^1lepcat Steps 2, 3« and 4 using tho acorcs for tho poor group. Call 
these Ng^ Mg, and Vg. 

SCep6r-Computo the number of degrees of freedom, (Add Ng to Np, and 
subtract 2.) 

Step 7r-ConiputG tho difference (Dp^) between tho two moans. (Using Af^ 
and Mp, subtract tho smaller f^om tho larger.) 

1 See some standard work in statistics such as LiNsquiar, E. F. A first course 
tn siaUstics, Boston: Houghton MifHin, 1042. 



APPEl^DtX A 


207 


Computo tho standard error* 

o. Multiply Vg by Ng, 

b. Multiply Yg by Ng, 

0. Add tho two produots. 

d. Divide this sum by '^dojp-eea of freedom.” (See Step 6.) 

c. Now sura Ng and 7/p, 

/. Multiply Ng by Np. 

g. Divide tho sum obtained in line e by the product obtained in 
line /. 

b. Multiply this quotient by the quotient obtained in d above. 

1. Obtain the square root of this product Tliis final answer ia the 
standard error of tho di/Terenco between means. 

Step 0.— Compute the sisnificance ratio (t). (Divide the difference obtained 
in Step 7 by tho estimated standard error obtained in Step 8.) 

Step lO^Using t obtained in Step 9 and the number of degrees of freedom 

obtained in Step 6^ from Table IX determine tho probability. 

a. In the tabic locate the appropriate number of degrees of free^ 
dom jn tlio left columiu 

h« On the same lino find tlio t value equal to or next smaller than 
the value obtained in Step 9. 

e. Note tho proportion nt the top of tho column in which this 
valuQ is found. This is tho probability that tho difference 
occiUTcd through chance. 

Meaning of Level of Confidence. — ^Pfiychologlsts os a general rule start 
their investigations with tho so-called "null hypothesis" which nssumes tlio groups 
to be equal or^ moro acouratcly, ossumoa there Is no real difference and that such 
differences as are obtained have occurred solely througli chance juRt as the two 
clubs and six diamonds occurred by chance in the bridge hand in the earlier dis« 
ouBsion. They then proceed to test this hypothesis and to accept it or reject it 
with a certain level of oonfidenco. Tho lower tho probability of chance ocourrenoe, 
tho greater is tho certainty of tho experimenter when he rejects the null hypoth** 
esis. In other words^ if he finds his i value in tho 6 in 100 column^ his conclu^ 
sion is that tlie probabilities are B in XOD iliat the groups are alike (or that the 
obtained difference could havo arisen by chance). By the same rcaaoningi if ho 
finds his t value in tho 1 in 100 columui ho rejects the null hypothesis at the 1 per 
cent Icvoli which moans Uiat there is only 1 chance in 100 (as compared with 6 
in 100) that the difforenco could have arisen through chance alone. Qenerally 
speaking, researchers rojoot the null hypotlicsis only when the probabilities are at 
most 5 in too that the diffcrcnco may bo attributed to chance. 

SAMPLE PROBLEM 

Ice-company Employees. — On page 147, Fig. 0-20 shows the relationship bo- 
twoen scores on the Purdue Bfeckanical Adnptabditg Test and tho owner-man- 
ager’s ratings of fourteen mechanics in an ice company. Following are tho values 
that are plotted in that graph, (A rating of 8 ia high, and t ia low.) 



30B 


PRINCIPLES OP PERSONNEL TESTING 


TABLE IX -MINIMUM VAI^UES OP I (THE SIGNIFICANCE 
RATIO) FOR DIFFERENT DEGREES OF FREEDOM RE- 
QUIRED FOR VAIUOUS LEVISL8 OF CONFIDENCE IN 
CONCLUDING THAT TI5BT SCORliS ARE REALLY 
RELATED TO JOB SUCCESS • 


Degrees of 


Probability that obtdncd dilloroncos ocourred through ohaiico 


freedom 

(]V,+N„-2) 

20 in 100 

6 in 100 

2 m 100 

1 in 100 

1 in 1,000 

1 


12.700 

31.821 


030.010 

2 


4.303 

0.965 


31.508 

8 


3.182 

4.541 

6.841 

12.041 

1 

1.633 

2.770 

3.747 

4.004 

8.010 

5 


2.571 

3.805 

4.032 

6.850 

6 

1.440 

2.447 

3.143 

3.707 

6.959 

7 

1.416 

2.305 

2.008 

3.490 

5.405 

8 

L807 

2.300 

2.800 

3.355 

6.041 

9 

1.383 

2.202 

2.821 

3.260 

4,781 

10 

1.872* 

2.228 

2.704 

3.109 

4.687 

U 


2.201 

2.718 

3.100 

4.437 

12 

1.366 

2.179 

2.081 

3.066 

4.318 

18 

1.860 

2.100 

2.060 

3012 

4.221 

14 

1.346 

2145 

2.024 

2.077 

4.140 

15 

1.341 

2.131 

2.002 

2.047 

4.073 

10 

1.337 

2.120 

2.583 

2.021 

4.016 

17 

1.883 

2.110 

2.507 

2.898 

8.005 

18 

1.330 

2.101 

2.662 

2.878 

3.022 

10 

1.328 

2.003 

2.639 

2.801 

3.883 

20 

1.326 

2.080 

2.628 

2.845 

3.860 

21 

1.323 

2.080 

2.518 

2.831 

8.810 

22 

1.821 

2.074 

2.508 

2.810 

8.702 

23 

1.810 

2.000 

2.600 

2.807 

8.707 

H 

1.318 

2.004 

2.492 

2.707 

3.746 

26 


2.000 

2.485 

2.787 

3.726 

26 

1.316 

2.050 

2.470 

2.779 

8.707 

27 

1.314 

2,052 

2.473 

2.771 

8.000 

28 


2048 

2.407 

2.708 

8.074 

20 

1.311 

2.046 

2.402 

2.750 

3.060 

80 

1.310 

2.042 

2.467 

2.750 

8.040 

40 

1.303 

2.021 

2.423 

2,704 

8.661 

60 

1.200 

2.000 

2.300 

2.060 

3.400 

120 

1.280 

1.080 

2.358 

2.017 

8.878 

OD 

1.282 

1.000 

2.320 

2.570 

8.201 


« TkU* tx h abtUasd fiom ynfam A Yatii, SfoitiUKa TMa /or Avriculiurol, wtS 

Mtdkal Rtuanh, OUvor A Bojnt /.Imltod, Bdlnbiirgh, by poRniidon of the authon ud puMUhi^.. 















APPENDIX A 


209 


Rfitmff 

h 

4 

4 

4 

5 
8 
8 
8 
8 
a 
a 
a 
a 
1 


Teafc Score 
114 
104 
103 
05* 

106 

101 

04 

08 

88 

08 

80 

86 

81 

74 


Using these valuesj tlie aforementioned steps have been followed below in order 
to determine whetJicr or not this particular test shoulil bo used in selecting 
mechanics for this particular job in the future. 

Eleph — For the purposes of this analysis, mechanics rated 3, 4, or 5 were 
considered good and those rated 1 or a were considered poor. 


8tep%^Nf « 0 

Step 3. — ATff « 807 H- 9 >- 90.7 

SUp 4. — 7, " 9,080.67 - 09.7* - 40.58 

SUpB.-’N, -B 

Afp ^ 423 -I- 5 - 84.4 

Vp - 7,106.40 84.4* - 43.04 

Step 6. Degrees of freedom ^ 12 

(0 + fi “ 2) 

Step 7. - - 00.7 -- 84.4 = 16.3 

sup 8. -fl, X - 448.22 
6. 7p X AT, - 216,20 
e. Sum » 661.42 
A Divided by 12 ^ 65,12 
s. IV, + ATj. « 14 

A NgXNp^4G 

g. Sum in line o divided by product in lino / ** 0.31 

h. Quotient g times quotient d ■“ 17.087 

i. Square root « 4.13 

Step 0. — t 16.8 4.13 » 3.706 



aiO PRINCIPLES OF PERSONNEL TESTING 

10.— Using Table IX, the degrees of freedom vahio of 12 is located in 
tho left column aud the values across tlio table on Uiat lino are 
examined to find the ouo that is the sumo as or next smaller than 
the i value. Note tliat the vahto of 370i5 falls lietwren 3.056 and 
4318. According to tlic iiilo, 3 055 being the next smaller is selected. 
Sinco this vuluo la in ilio 1 in 100 column, tho conclusion is tliat 
tbero is only 1 chance in 100 tliat this dilTcrenco in avemgo scores 
could have occun'ed through clmnco. 

In this particular problem, tho Purdue Mechanical AdaplabilUy Test can bo 
used for selecting icc-compiiny mechanics with a very liigli degree of assuranco 
that it will identify ilioso who will bo considorod better by the owncivmanRgor. 

BIFFEKENCES IN PERCENTAGES 

Tha SigniRcanca of UilFarances in Percentage.— In ilio same fashion as 
outlined in the discussion of means, percentage diltcrcnccs can ariso through 
ohaneo alone. Since many comparisons that tho personnel tester makes are 
comparisons of percentages, a siinple proceduro is outlined liere. Hero again, ox* 
tensive discussion is impossible and only a simpliried step-by-step prooesa ia 
presented. 


TAimn X 



Low 

oritorion 

group 

High 

oritorion 

group 

All 

Above 
critioal 
score , « . 

a 

b 

a + h 

Below 

critical 

score... 

c 

d 

e + d 

All,.,. 

fl + c 

h + d 



Tablb XI 



Low 

oritorion 

group 

High 

oritorion 

group 

All 

Above 

critical 

RGoro,,, 

0 

43 

43 

Below 

critical 

score.,. 

35 

21 

67 

All 

30 

04 

kxT^ 


PROCEDURE FOR EVALUATING DIFFERENCES BETWEEN 
PERCENTAGES 

Step 1^-SeIoct a orltiool aoore on tho test, and prepare a tablo like Table X, 
placing in space a tho percentago of the whole group who wore in 
tho law criterion group and who were above tho critical score, Tho 
values in cells a, b, c, and d diould total 100 per cent. 

Step 3.^Eiiter tho appropriato values Jn tho ealto labolod a + o^b + d, a + 

and c + d. As indicated each of these values is the sum of the other 
two values in tho some row or column. 



























APPENDIX A 


211 

8tepB.—NoYr multiply aX d, and subfa-act tho product from the product of 
cXb. 

4.— Multiply a + c by b + d and also a H- 6 by c + d. Multiply these 
two products. 

Step S^Now take the square root of the product obtained m Step 4. 

Step Od--Divide tho value obtained in Step 3 by the valuo obtained in Step 6, 
(This is called tlie phi coeffleienL) 

8tep7j^qvLfiro tho value obtained in Step 6, and multiply it by N, the total 
number of omptoyecs. (This la called vh\ square.) 

jS/cpS^Now, using Table XII, locate this value (cAi square) or tho next smaller 
value in tlic table. Note Uio proportion above the value. This is the 
probability of chance occurrence. 

TABLE XII.— POn DETERMINING PROBABILITY THAT PEIU 
CENTAGB DIPlfElUSNOE OCGtJIlRBD THROUGH CHANCE 


20 in 100 

6 in 100 

2 in 100 

lin 100 

1 in 1,000 

1.642 

2.706 

8.841 

5.412 

(3.636 


SAMPLE PROBLEM 

Same Data.— The same data utilised above can be used for demonstrating 
this particular motliod. Roforring again to the scattergram in Fig. 9-20, suppose 
that those rated 3, 4^ or 5 are placed in tho high-criterion group and that those 
rated 1 or 2 arc placed in tlio low-criterion group. Suppose that a critical score 
of 95 is established. Table XX shows the resulting poroontages. The remaining 
steps in tlio procedure for evaluating this relationsliip are os follows: 

£((sp2.— Top row, 0 + 43 — 43 

Bottom row, 36 + 21 « 67 
Left-hand column, 0 + 36 — 36 
Right-hand column, 43 + 21 — 64 

Step 3r-(36 X 43) - (21 X 0) = 1,648 

Step 4.— (86 X 64) X (43 X 67) ^ 6,647,104 

Step 6^V6,C4?,rC)4“ « 2,376 

Btep 0^1,648 divided by 2,376 »0B63 

Btsp 7^(OB62) » X 14 « 6.962 














PE1NCIPLS8 OP PBRSONNBh TBBTINQ 


ais 

Slep8 ^ — ^Eeforring to Table XII^ the value next smaller than 5.952 is 5.412 
which ia in tho 1 in 100 column. That ib, the probability that auoh a 
relationahip might have ooourrod through clianco Ib less than 1 in 100. 

Here again, tlicro ia evidence tliat Uie Purduo Alcchanical Adaplability Test 
con be uaed elfcotivcly in tho aclection of ioc-houso mcclmuica in this plant. 

Necessity for Statistical Analy8i8.--An{Uy8cs of thia typo 7nvsl bo made if 
tho testing program is to suecoed, Failuro to use siirGgiiarda of thLs typo liave 
been the cause of most lost-program failures. A program cannot fail when tlie 
batting odds are known, and this is the way in which they am determined. 



APPENDIX B 

FUNDAMENTALS OP TEST 
ADMINISTRATION 


If testing is worth whilcj it is worth doing wclh Test scores are noUiing more 
or less than measures of human performance; and since human performance is 
variablcj scores arc also variable. This means that m order to make comparisons 
among scores evciy effort should be inado to sec that the conditions under which 
testing is done are tho best and that they are uniform. This section is intended 
to help in the attainment of these goals. 

The Applicant and Testing. — Altliough recent military service has given 
many applicants experience in taking tcstsi tho fact remains that employinent 
testing IS relatively new in imlustry. ISvcry applicant expeofca to fill out an 
application blank and to bo interviewed, but not everyone expects to take tests. 
Furthenuoro, many applicants carry over a disliaict dislike for “cxaininutions’’ 
fiom tho school situation to tho employment situation. Particularly when the 
applicant feels insecure or otherwise is not confident regarding his ability, ho is 
apt to oxporieuco a mental block, The following suggestions sliould bo helpful: 

1. Do ovorytliing possible to put the applicant at case. Make him feel that 
ho is not ''on tho spot.*' Explain that tests along with application blank 
and interview data help prevent misplacements where the cinployec is apt 
to bo unhappy and unsatisfactory. 

2. Avoid creating tlio impression that selection or rejection is solely dependent 
upon passing tho tests. Explain that the whole picture counts. 

3. Never discuss tho tests ahead of time or tell him anything about what they 
nioosuro, 

4 . Avoid tho U60 of tho term examination, "Employment tests," "occupational 
tests," or "poraonnci teats" arc bettor than "tests" alone. Avoid such terms 
os peraonahty, temperament, and intellioenco, 

5. Avoid praoticQB wi^ minority groups that can be construed to bo discrimi- 
natory. 

The Applicant and Test Results.— Whether or not tho applicant should 
have his test porformanco discussed wiUi him is a knotty problem. The author 
f Gols tliat thoro is no categorical answer but that, generally speaking, tho policy 
should bo contingent upon the personnel availablo for such discussion. If tho 
company has an individual tlioroughly schooled m the psychology of coimsoling, 
such a practico is acccptablo and often desirable. If there ia no such trained 
person, or if tlioro is insufficient time for on adequate and complete interview, the 
task hod boiler not be undertaken. In the event that results are diaousaod with 

213 




214 PRINCIPLES OF PERSONNEL TESTING 

iho applicant, raw scoros or more complicated scores should not bo employed. He 
should be told simply that his pcrfonnancc was average, above average, or below 
average. And finally, he should never be allowed to feel that the testa kept him 
from securing a job. Just as test scores sliould not be flaunted at iho applicant, 
so should they not be broadcasted about the plant. Test acorcs sliould bo con- 
sidered restricted information and should be kept confidential in iho same fashion 
aa medical records. 

The Testing Room. — A aeparato testing room should be available for best 
results. If a room siiccifically for testing is impractical, ono section of a larger 
room should be screened oDf. Tlicm sliould be no ihrougit trafiic and no curious 
employees or applicants watching. The noise level should be low, but complete 
ailcncc is not nccosaary and probably not desirable. Proper lighting and ventila- 
tion should receive attoutiou as well as appropriate tompcrutiirc control ; particu- 
larly in the winter when applicants are apt to be wearing heavy outer clothing 
the temperature should be slightly less than nonnal room temperature. IViblcs 
of standard height and comfortable chairs should be available. Plenty of elbow 
room should be provided Where several applicants are tested at a linie, the 
arrangement sliould be such lliat all can sec and be seen by the examiner. Some 
companies find it desiruble to display a sign in tho waiting room similar to the 
following: 


NOTICE TO APPLICANTS 

Tho Occupational Test forms you are asked to fill out for tho A. B. C, Com- 
pany do not determine whether you will be hired or rejected. The forms arc 
used as part of a slandaid hiring procedure in order lo lielp dotormino tho 
position or job for which you aro best fitted. Tho Occupational Teals you 
will fill out are not ones on which you "pass'’ or "fail"; they aro only stand- 
ardised measures of interests and abilities. 

Tho A. B. C. Company 
Personnel Department 

Bquipment^Tlie examinor should provide each applicant with two sharp 
pencils and should have nn extra supply available. He should have two stop 
watches or ono stop watch and a clock A wall clock is not recommended, since 
it tends to make some applicants nervous. Tho administrator should have a 
desk with sufficient drawers or otlicr compartments for now and used tests. 
These sliould not bo visible to tho applicant. Tlic usual precautions should bo 
employed to prevent applicants from carrying pencils and tests away. 

Tho keeping of a dai^ logbook is recommended. Daily activities should be 
recorded. When new editions of testa aro used to roplaoo old ones, appropriate 
entries should be made. Evci-ything should bo dated. 

The Administrator.— Tho good test ndmiuisti*ator is ono who likes people 
and who has time for them. Ho looks after tlio applicant’s physical comfort and 
is aware of impedimenta such as coats and bundles. He puls tlio applicant at 
ease, docs not hurry or scold Jiim, and generally exhibits a friendly manner. He 
makes a favorable impression on tlio applicant, has a good speaking voice, and 
has no offensive habits or manners. Ho docs not smoke or chow gum while 



APPENDIX B 


215 

administering teats; lio dresses in n consorvaiive fasliion with no fads or outr 
latidish styles* Ho avoids wearing lodge or fmtormty jewelry and displays no 
political insignia. Whether men or women should be used as administrators 
cannot bo answered categorically. Although it is a matter of individual person- 
ality! women if adapted to tlic job frequently arc happier in the job than men. 

Administering the Test.— The administrator slioiild adhere religiously to the 
instructions provided with a particular test. Directions sliould be given exactly! 
and time limits should be observed. In administering papcr-and-pcncil tests the 
following points will be found helpful; 

1. Hold test up first and explain. 

2. Pass out test; again explain instructions with each applicant reading same 
from form. 

3. Ask for questions. 

4. Give practice or trial os per test instructions. 

a. Circulate and help; observe whether each is doing samples correctly* 

5. Give answers to praclico problems; explain if necessary. 

6. Ask for questions; do not violate test with answers though. 

7. Instruct on time! manner of stopping! etc. 

8. Use ^simple langimge; “stop,” “go/* etc. 

0. Help the illiterate or dull person who will fail anyway in order that he docs 
not become cmbiirrasscd; encourage him to finish as much as possible. 

а. Any applicant who emphatically quits or gives up should be pleas- 
antly dismissed without embarrassment to him. 

б. Be friendly; everything happens sooner or later; it*B all in tho day's 
work. ICocp recoil of unusual happenings. 

10. Guard against ony possibility of accusation of discrimination. 

Systematic Di8miB8nl,«-SomG companies have found it profitable to hand 
each applioant a card similar to the following: 

The A. B. C. Company wishes to take this opportunity to Uiank you for tho 
time you have taken and tho interest you have shown in applying for a 
position with us. You have now completed the various stops of your applica- 
tion, TJio employment department will give it thorough study and considcrw 
ation. You will hear from tho company within two days at which time you 
will bo told definitely whether you have been accepted or rojected for cm- 
ploymont. 

Sincerely yours, 

John J. Jones 
Personnel Director 
A. B. C. Company 

Obviously, whotover oommitmont rogardmg notifioBtion is made should be 

adhered to rigorously. . t * • 

Summary,— Tho purpose of this ssotion has bean to touch ths high spots in 
the whole area of odministrotion. If those suggestions arc followed, they should 
keep tho noophyto from serious pitfalls until personal osperienoo has been nooum- 
ulated. 



APPENDIX C 

COMMERCIALLY AVAILABLE TESTS 


Any ficJcctod bibJiogmphy o/ tests is sura to omit tests titat hava been /ouzid 
useful by aomcono m the ludustrml field. Howover^ siucc it is impossiblo to 
present an all-mclnsivo list/ a few fioloctod titles are presented on the following 
pages. Those teats which have been referred to in tlic text and are commercially 
available arc listed. In addition, the list aIbo includes otlior testa tliat have 
already demonstrated their applicability in industry or seem promising to the 
author 

Following are the names and addresses of a number of suppliers and publishers 
of tests. Most of those listed wiil supply catalogues or descriptive material on 
request, Note that each is coded witli a letter and tliab the code letter only is 
used in the list of tests. 

(A) Amcnenn Optical Company 

70 West 40lh iSt. 

Now York, N. Y. 

(B) BauBch and Loinb Optical Company 

635 St. i^aul St. 

Rochester 2, N Y. 

(C) California Test Bureau 

691G Hollywood Blvd. 

Los Angeles 28, Calif, 

<D) Humm Porsonncl Service 

1210 West Twelfth SL 

Los Angeles 16, CnliL 

(E) Keystone View Company 

Mcndvillc, I’a. 

(F) TJic PoychologiGsl Corporation 

522 Fifth Avo. 

New York 18, N. Y. 

(G) Public School Publishing Company 

500-^18 Nortli East St. 

Bloomington, 111. 

(H) Division of Applied Psychology 

Purdue University 

Xiafayetto, Ind. 

^ For a comprehensive list of tests of all kinds see Bukos, Oscau K. (ed.). 
TM 1040 menial mcanyrements yearbook. Highland Fhrki N. J. ; Mental Measure^ 
monte Yearbook, 1041. 074 pp. 


2M 




APPEl^DIX C 


217 


(I) BcioncG UcBcarcIi Asaoclatos, Inc* 

228 South Wabash Avo. 

Chicago 4, III. 

(J) Sheridan Supply Company 

P.O. Box 837 
Beverly Hills, Cahf, 

(K) Stanford ITniveraity Press 

Stanford University, Calif. 

(L) Stevens Institute of Tedinology 

Hoboken, N. J. 

(M) C. H. Stooiting Company 

424 Nortli Homan Avo. 

Chicago, III. 

(N) World Book Company 

Yonkers, N. Y. 

While some tests are difBcult to classify, the groupings set up below will be 
generally meaningful to the reader. Page designations indicate reference to the 
teat in this book. Tlio designation (0) is used for those testa that are privately 
printed or tare not generally available. 

MENTAL ABILITY AND CLASSIFICATION TESTS 
Adaptability Test 

Tiffin and lAWahe (I) 61, 52, 68^74, 171 

Army General Clossificatiou Test (I) 60, 60-01 

Oral Directions Test 
Langmuir (F) 

Otis Sclf-administeidng Test of Mental Ability (N) 67, 131, 134, 

135, 143, 144, 163, 108, 172 

Personnel Test 

Wonderlio <F) 68,162,155,167 

Ihirduo Industrial Training Classificaiioii Test 

Lawsho and Montoux (I) 41-53, 129, 130, 149 

Senior Classification Test 

Preasey (Q) 147, 164, 165 

Senior Verifying Teat 

Presscy (G) 147, 164 

SRA Verbal Classification Form 
Thurabone and Thurstone (I) 

BRA Non-verbal Classification Form 
McMuny and Johnson (I) 


Soovillo Mental Ability Test (F) 153 

MANUAL AND MANIPULATIVE TESTS 

Blum Sowing Maohine Tost (0) 130 

Finger Doxterity Test 

O'Connor (L) 129,134,140,164 



218 TRWCWLEB OF PERBOmBL TESTING 


Hfuid*<tool Dexterity Test 
Bennott (F) 

Minnesota McclmniCAl Aaaombly Test (F) 164 

Miniicaola Hate of Mnnipulaiion Tost 

Ziegler (F) 133, 138, 130 

MiniicRoia Siiatial Itelations Test (F) «... 143, 164 

Purdue Hand Prooision Test 

Tiffin (H) 133 

Purdue Meolianical Assembly Test 

Granoy and Tiffin (H) 142 

Purdue Pegboard 

PiirduG Rceoaroh Foundation (I) 

Stenquist Mochanical Assembly Test (M) 147, 148 

Tweezer Dexterity Test 

O^Connor (L) 120, 154 

Western Elcotrio Form Board (0) 134 

UECHANICAL APTITUDE OR COMPREHENSION TESTS 

MaoQimrrie Test for Mccbnnieal Ability (C) 130, 181, 144 

Purdue Mcolianicnl Adaptability Test 

Uwshc and Tiffin (H) 120, 130, 142, 140, 140 

Revised Minnesota Paper Form Board 

Likert and Qunslui (F) 133, 184, 143, 144, 172, 174, 176 

SRA Tests of Mochanical Aptitude 
Richardson ct oi. (I) 

Test of Mechanical Comprehension 

Bennoit (F) 141, 143, 144, 147, 172 


STENOGRAPHIC AND CLERICAL TESTS 

Blaokstone Stenographio Frolicionoy 
Tost (F) 

Kimberly-Clark Typing Ability Analysis 

Jurgensen (1) 166 

Minnesota Vocational Test for Clerical Workers 

Andrew (F) 138, 152, 164, 160, 167 

Purdue Clerical Adaptability Test 
Moore, Lawshe, and Tiffin (H) 

SRA Test of Dictation Skill 

Hidiardson and Fodcracn (I) 

SRA Test of Typing Skill 

Richardson and Pedersen (1) 

Stenographio Aptitude Tost 
Bennett (F) 

Stenographic Prondoney Testa 


Seashore and Bennett (F) 166*-160 

Thurstoue Examination in Typing (F). 165 



APPBNDIX C 


219 


TBMPEHAMBKT AND PERSONALITY TESTS 
AflccndancoHSttbmiBsion Test 

Afiport (F) ^ ^ ^ go 

E 0 II Adjustinent Inventory (K) 181 

Claasificntion inventoty 

Jurgonsoti (0) 86,167 

Hiiiiim- Wadsworth Tempcramotii; Scale (D) 81-83 

Introvcrsion-exirovoiBion Tost 

Root (0) 103 

laventory of Factors GAMIN 

Guilford and Marlin (J) 83 

Invcntoiy of Factors STDCR 

Guilford ( J) 83 

Minnesota Multiphasio Fersonahty Inventory 
Hathaway and McKinley (F) 

Personality Inventory 

Bomreuter (F) 38, 35| 163, 167 

PoTBonncl Inventory I 

Guilford and Marlin (J) 83-85 

llovision of A-S llcaolion Study for Business Use 

Bookman <F) 85-86, 163 

Social Adjustment Inventory 

Waslibumo (F) 131-138 

INTEREST TESTS 

Kuder Profercnco Roconl (1} • 88, 89-91, 95 

Occupational Interest Blank for Women 
Manson (F) 

Vocational Interest Blank for Men 

Strong (lO 87, 01-04, 166, 167 

Vocational Interest Scliodulo 
Thurstono (F) 

TRADE TESTS 

Can You Road a Micromotor? (Piirduo Interview 
Aids Series) 

Lawslio (I) * 128 

Can You Read a Scale? (Purdue Interview Aida 
SorieB) 

Lawshc (I) 

Gan You Read a Working Drawing? (Purdue 
Interview Aids Sorics) 

Lawsho and Lindahl (I) 128 

Practical Electrical Information Test 

Laweho 18® 



220 FniNClPLEB OP PERSONhfEL TESTING 


Purduo Bhicprmt Beading Teat 

Owen and Arnold (I) 127 

Ftirduo Test for Elcctrioinna 

Caldwell el aU (1) 127 

Purduo Tost for Maohinists and Maohinc Operators 

Owen cl al, (1) 127 


VISUAL SKILL TESTS 


Ortho^Rftter (B) 102-U0 

Sight Screcnor (A) * 102-103 

Tolobinooular (E) 102-103 


ARITHMETIC AND MATHEMATICS TESTS 

Arithmetical Rcafioning Test 
Cardan <I) 

Clapp-Young Aiiihmctic Teat (F) 

Purdue Indualrial Mathematics Teat 
Lawshe and Price (H) 

Soliorliag-GiArjc-Fottcr Arithmetic Teat (F) 

VOCABULARY TESTS 

EngUah Vocabulary Test, Work Bnmplo 06 
O’Coimor (L) 

English Vocabulary Teats 
Markhnin (Q) 

Michigan Vocabulary FroAlo Test 

Greene (N) 174, 176 

Wide Bango Vocabulary Tout 
Atwell and Wells (P) 

MISCELLANEOUS TESTS 


Aptitudo Index (0) 165 

How Supervise? 

File and Rommera (F) 178 


Stanford Scicntifio Aptitudo Test 
Zyvo <K) 

Teat of Fraoticai Judgment 
Cardall (1) 



INDEX* 


A 

Abaoiflsa, 42 

Abaonteeifinii records of, 8 
08 criterion, 23 
Accident pronencas, 118 
Accidents, 08 criterion, 23-24 
records of, 8 

as related to vision, 118-119 
Accounting nmuhine servicemen, 93-94 
Acuity, distonco, 104-100 
near, 106 

worse eye, 100-108 
(See also Viaioa tcala) 

Adding mnehme operators, 167-158 
Administration of tests, 214-216 
Advertising men, 91-92 
Age factor, control of, 88 
visual changes with, 98-101 
Agrccablcncsa component, 70 
Aircraft mcchanio learners, 147 
Aircraft riveters, 134 
American Optical Co., 210 
American Psychological Association, 
102n. 

Anderson, Hodwin C., 89 
Andrews, Amy C,, 141-142 
Applicant, and teat results, 213-214 
treatment of, 213 
Apprentices, 73 
machinist, 143-144 
mixed groups, 149-160 
printing pressmen, 148 
Arakelian, Peter, 145-140 
Arithmetic tests, 220 

* A complete index of test 
C, pages 217-220. 


Armstrong, T, 0., 153-1B4 
Ascendance-submission component, 78 
Asacmblers, 73 
electrical fixture, 130-131 
glove, 131 
radio, 130-131 
Auiistio component, 77 
Averages, cumulative, 46-47 
mctliod of, 44-47 
simple, 44r45 
successive, 45-40 

B 

Bar-mill employees, 139 
Bausch and Lomb Optical Co , 103, 216 
Bench workers, 136-137 
Benge, Eugene J., 132 
Bennett, George K , 123-124, 141 
Bills, Marion A., 68, 04, 93, 153, 166-166 
Blaoksmitlis, 148 
Blum, Milton L., 129-131, 160 
Bookkeeping machino operators, 156- 
167 

Budget, for testing, 104 
Buros, Oscar K., 14n., 2I0n. 

C 

Cable formers, 136 
Calculating machine operators, 168 
California Test Bureau, 216 
Candoe, Bcatrioc, 130, 160 
Card-punch-machine operators, 157 
Card-stacking method, 29-32 

the book appears ia Appendix 


titles referred to in 

m 




INDEX 


222 


Carpentera, 148 
Ooahior checkers, 3-4 
Cnahinra, lOS-lQO 

Onaimtty-msurAnco flAlosmonp 03, 165- 
106 

Cattell, Hoymond B., 66n. 

Chance diffcroncos, 206-206 
Chi squnro, 211 
Clarke, Walter V., 168 
ClAasification tcaLe, 217 
Clcetoii, Glen V., 176-176 
Clerical employees, 60 
tests for, 218-210 

Clerioal work, and mental ability, 153 
nature of, 151-152 
routine, 162-163 
test batteries for, 153-164 
Coil winders, 134-135 
Coleman, J. H , 104^106, 114 
Collins Radio Company, 127 
Color factor, control of, 88 
Color vision, 160-110 
Company policy, 20-21 
on personnel testing, 103-107 
Gonipononla of tompommonfc, 77-70 
Confideiico level, 207 
Cook, D. W„ 134-137, 140, 154-166 
Gooperativoness, 70 
Correlation, cocflicionfc of, 42-44 
Cotton-niill-machino fixers, 147-148 
Crlssey, Orlo L., 143 
Criteria, oceidont data as, 110-120 
in ollice jobs, 162 
Criterion, external, 181-187 
of inlornAl consistency, 188 
Criterion groups, 16-17 
selection of, 20-30 
Critical scores, 201 
Cniikahank, Ruth M., 123-124 
Cycloid disposition, 78 

D 

2)-valuo (disariniination value), 186- 
187 

Deduotivo ability, 66 
Departmcnbflloro paokors, 109 
DepBrtm6nli>«torQ salespetsons, 6 


Depressivo component, 77-78 
Dial swltdimen, 145-140 
BilTerencos, and olianco ocourronce, 206 
200 

Discrimination value of tost items, 186 
187 

Ditto-machine operators, 157 
Dodge, Arthur IF., 167-168 
Dorous, Roy M., 82-84 
Drill-press operators, 116 
Dvorak, Beatrice, 164, 175 

E 

Earle, F. M., 148 
Earnings, ns criterion, 22 
Electric soldercra, 112-114 
Electrical-fixturo Assemblers, 130-131 
Elcctrioal inspectors, 146 
Electrical test technician, 127-428 
Elcctrioal testers, 146 
Elootrical traincos, 10-11, 60-70 
Eleolrical troubicnien, 144-145 
Electricians, 148 
Engineering operatives, 141-142 
Engineers, 17^174 
Environment and heredity, 
as determiners of difforonccs in abil-* 
ity, 6-7 

relation to mental abilityi 56-57 
Epileptoid component, 77 
Ewart, Edwin, 27n. 

Executive personnel, 176-176 
Executives, 73 
External oritorion, 184-187 
Eye care, job improvement tlirough, 101 
referral for, 120 
Eyewear, safety, 121 

V 

Falsinoation on tests, 80-81 
Fear, Richard A., HI 
Fcinbcrg, Richard, 121 
File, Quentin W., 173 
Fisher, R, A., 207n. 

Fittora, 148 

Fixed gauge inspeoUus, 73 



inmx 


223 


Follow-up motbodi 16-10 
advantages of, 10 
Food oimnoiiBi 132 

Forced dislribution, in ratingi 8S-64 
Forlano, George, 131 
Fox, John B., 7n., 0, 23 
Freiglit solicitors, 88 

0 

Gage, N. L., 180, 182 
Geiso, W, J., 97 
General activity, 78 

Ohiselli, Edwin E., 132-133, 167-168, 166 
Glove asseinblcra, 131 
Goodman, Charles H., 130-131 
Gottadanker, Robert M., 168 
Greenly, R. J., 130 
Grocery cheokora, 8 
Guidance^ testa in, 94-95 
Guilford, J. P., 77-79, 83 

H 

Habits, 76-77 
Hall, Milton 0., 148 
Halo eiTcot, in ratings, 27-28 
Hardtko, E. F., 134 
Harrell, Margaret S., 68 
Horroli, Thomas W., 68 
Harrell, Willard, 147-148, 171-172 
Hay, Edward 1^., 166-167 
Hayes, Eleanor G., 134, 137, 138 
Hayes, M. H. S., 04 

Heredity and environment, as deter- 
miners of diiTcrcncea in ability, 0-7 
relation to mental ability, 60-^7 
Hosiery loopcrs, 12-13 
Hovland, Carl I., 68n. 

Humin, Doncaster G , 77, 81-83 
Humm Personnel Service, 216 
Hystoroid component, 77 

I 

lee-company meciwnios, 146-147, 207- 
212 

Inductive ability, 66 


Inferiority feelings, 78 
Insi^eotors, 36-36, 73, 133 
electrical, 146 
piston rings, 116 
Intelligence, 66 

Intelligence quotient (I.Q.), 67 
Interest tests, 87-96, 219 
limitations of, 94-06 
in mechanical jobs, 124 
and supei-visory jobs, 172-173 
Internal consistency, os criterion, 188 
Introversion-extroversion, social, 78 
thinking, 78 
Item validation, 88-89 

J 

Jacobsen, Eldon E , 147 
Job analysis, 14, 200 
Job olasaliication, in clerical jobs, 169- 
161 

Job difTerences, in ratings, 28-20 
Job families, visual, 117-118 
Job-knowledge testa, 126-129, 210-220 
OS aptitude tests, 128-129 
oral, 126-126 
outline for, 178-180 
and upgrading, 127-128 
written, 126-127 

Job level, as criterion, 162, 160-161 
Job samples, 34-37 
Job sati^actiou, 63-64 
Johnson, Beatrice R., 93-04, 107 
Judgment, as cidterion, 24-34 
pooling of, 34 

Jurgensen, Clifford E., 138-139, 166n., 
107 

K 

Kepharfc, Newell C., 101 
Keystone View Co , 216 
Kirkpatrick, Forrest H., 131 
Kuhn, Hedwig S., 09, 120, 121n. 

Kurts, Albert K., 164—166 

L 

Labor unions, attitude toward tesUng, 
108-109 



2NI>EX 


224 

Laingi Donald M<, 68 
Lniindry-supply salcsmoni 166 
Laundry workors, 138 
LnwRhOi C. H , Jr.| lOn.^ 35-3Qj 
Cfln., 123n., 129rt., 142, 14&.147, 149, 
186rt., 187 

Learners, nircrafL mechanics, 147 
Learning period, and test scores, 17 
Learning time, ns criterion, 23, 37- 
38 

Leodko, Hazel N., 118-110 
Length of service, na oritcrion, 23 
and mental ability, 60-63 
Level of confidence, 207 
Life Insurance Sales Ucscarch Bureau, 
164-166 

Life-insurance salesmen, 86-86, 92-03, 
102-165 

Life OiHco Management Association, 
162, 158-161 

Lindquist, K F., 42n., 200?i. 

Long, 123n. 

Loopein, hosiery, 12-13 

M 

Maoliino accounting raoihods salesmen, 
167 

Machine beokkeepors, 166-167 
Machine spocialists, 73 
Machine tool operators, 140-142 
Machine tool work, 139-140 
Malcontents, 84 
Management reports, 201-202 
Managers, braneli-oliico, 67 
potential, 73 
Manic component, 77 
Manipulative ability, measurement of, 
123-124 

Manipulative tests, 217-218 
Manual tests, 217-218 
Martin, Howard G., 83 
Masculinity-femininity, 78 
Mason, Clmries W., 176-170 
Mathematics tests, 220 
Mechonioal ability, measurement of, 
123-124 

teats of, 124, 218 


Mechanical aptitude, inoasuromont of, 
123-12-1 
tests of, 218 

Mcclmiiical comprohension, measure- 
ment of, 124 
tests of, 218 

Mechanical information, incosuremont 
of, 12-1 

Mechanics, ico-compony, 146-147, 207- 
212 

Memory ability, 66 
Mental abilities, 64r-66, 168-169 
Mental ability, and clerical jobs, 168, 
168-150 

and job placement, 68-66 
for stenographers, 165 
and supcrvisoiy jobs, 171-172 
tests of, 217 
Mental fluency, 65 
Mental Mea3urement8 Yearbook, 14 
Merit rating systems, 24-27 
Mi11ing-<niaduiio operators, 114-116 
Miiltiple>*ehoico queslions, 182 
Musscr, Wayne, 28n. 

N 

Naval electrical trainees, 10-11, 69-70 
Nervousness, 78 
Nowbold, 13. M., 7n, 

Normal component of temperament, 77 
Numerical ability, 65 

0 

Objectivity, 70 

Occupational Analysis X^aboratory, Pur- 
duo University, INn. 

Occupational groups, dilferenccs be- 
tween, 80-01 
OflicQ clerks, 60, 73 
testa for, 218-210 

Oflico oquipinont BalcRmcn, 166-167 
Ofllco macliino opcrnlora, 166-168 
Ofllce managers, branch ofllce, 67 
potontial, 73 

Ofllco work, naturo of, 16M62 
Ohmann, 0. A., 22-23 



INDEX 


226 


Oporatorai nosortlng room, 183-134 
bookkeeping machine, 166-167 
oaiculAting niBciifnc, 168 
oard-pimcli-maohino, 167 
ditto-machine, 167 
driil-pross, 116 
machine-tool, 140-142 
milling-maoluno, 114-116 
nonnutomatio machincB, 73 
oflicc-machino, 16G-168 
papor-converting-maohino, 13&-13Q 
pimdi-prcss, 138 
simple machinea^ 73 
Ordinate, 40 

Organissed labor and testing, 108-100 
Ortho-Rater, 102-122 
Ostcrmick, Ralph E., 162-163, 166^167 
Otis, Jay L., 166 

P 

Packers, 73 
department store, 100 
pknrmaccutioal, 132-133 
Piiper-convorliiiR-machino operators, 
138r-130 

Paranoid component, 77 
Patten, Everett F., 140 
Percentages, onmulative, 40-60 
method of, 47-11% 
simple, 47-48 
BuccesfdvQ, 48-40 

Percentile scores, meaning of, 60-51 
Perceptual speed, 55 
Personality measurement (see Temper- 
ament) 

Personnel data, 23-24 
Pharmaooutioal packers, 132-133 
Phi coefficient, 211 
Phoria, 108-109 
lateral, 104 
vertical, 104 
Fiokler-lcarncrs, 71-72 
Piston-ring inspectors, 115 
Policy, company, 20-21 
personnel-testing, 103-107 
P6nd, Millicent, 68-60, 62-63, 64n., 243- 
144, 163-164 


Present employee method, 13-16 
advantages of, 18-19 
Primary mental abilities, 64-65, 168-169 
Printing pressmen apprentices, 148 
Private secretaries, 73 
IVoblcm employees, and 1, 66 

Production data, 22-23 
characteristics of, 6-6 
Profile, group, 62-63, 00 
individual, 51-^2 
method, 60-63 
vision, 104-106, 110-118 
Promotion, as criterion, 04-66, 252 
P^chological Corporation, 216 
Public Scliool Publishing Company, 
216 

Punch-card-machine oporatoi-a, 167 
Punch-press operators, 138 
Purdue Research Foundation, 127 
Purdue University, Division of Applied 
Psychology, 117n., 127n., 216 
Occupational Analysis Laboratory, 
117», 

Q 

Quality, as criterion^ 22, 162 
Quality of work, 2 
Quantity, as criterion, 22, 162 
Questions, difficulty of, 83 
discrimination value of, 186-187 
preparation of, 182-183 
types of, 180-182 
validity of, 183-188 

R 

Radio assemblers, 130-131 
Radio tube mounters, 131 
Rankin, Bernard, 127n. 

Ranking method, 20, 33-34 
Rate of advancement, as criterion, 23 
Ratings, as criteria, 24-34 
forced distribution in, 32-34 
halo effect in, 27-28 
job differences in, 28-20 
Relay adjustefs, 236 
Remmers, H. H., 173, 180, 182 



INDEX 


226 

Roporte to managcmcnti 201-202 
Revision of tostj 188-100 
Rlmtliymiai 78 
Rivertorsi nii'cmft, 134 
Roberts, Willmtn H., 162-163, 166-167 
Rogers, H. B., 34-36, 133 
Room, conditions of, for testing, 214 
Rosaiio0, A. S., 77 
Rofionatoin, J. L., 86, 162 
Roes, Lawrence W , 140-141 
Rossett, Nathaniel E., 146-140 
Rtich, Floyd, 202 
Ryan, T. A,, 03-04, 167 

8 

Safety, ns criterion, 110-120 
Salary, as criterion, 162 
Sales criteria, 22-23 
Salesmen, 73, 86 
casualty-insurance, 03, 106-166 
laundry-supply, 106 
lifc-insuranco, 86-^, 92-03, 102-105 
maoluno-accoimting-mothods, 03, 167 
oflico-cquipmont, 106-167 
Salesporsnns, retail-store, 6, 73, 107-168 
Sampling in testing, 177-178 
Sarbin, ThcodoTO H., 89 
Snrtain, A. Q., 172 
Scattci^mm, 40-42 
Sohulls, Rioliard S,, 86-86, 162-103 
Schwab, R» E., 174 

Soionco Hcscaroh Associates, Inc*, 217 
Soott, Jerome F., 7n., 23 
Scott, W. D., 64 

Screw-manufacturing employees, 142 
Scashoro, B. E., 27n. 

Socrctarios, private, 73 
Somanek, Irene A., 12011., 142, 146n., 
140n 

Servioemen, accounting maobino, 03-04 

Setup men, 73 

Sowell, John C., 121 

Sox factor, control of, 38 

Shartle, Carroll L , 126, 131, 144-146, 102 

Sheet-metal trainees, 134 

Shepard, G. F., 06n. 

Sheridan Supply Co., 217 


Shorthand transcription tests, 156-160 
Sliumnu, John T., 134, 143-144, 172 
Srglit-Screcnor, 102-103 
Signincaucc of di0crenco, 206 
between means, 200-210 
between percentages, 210-212 
StgiiiBcnnce ratio, 208 
Single variable, 37-38 
Snellen chart, 102 
Soldorcrs, 13^136 

Spatial relations, measurement of, 124 
Standard error, 207 
Standards chcekora, 73 
Stanford University Press, 217 
Statistical Analysis, 212 
Stead, William IL, 125 
Stenographers, 165-166 
tests for, 21^210 
Stevens, Samuel N., 07 
Stevens Inslituto of Technology, 217 
Steward, Vorno, 163--101 
Stoclting, C. II., Co., 217 
Store cashiers, 108-160 
Storo salespeople, 6, 73, 107-168 
Storckoopers, 73 

Strong, Edward K., Jr., 87, 01-03, 102, 
176 

Stump, Frank N., 110, 113-114, 118 
Siiporvisors, 72-74, 76 
getting support of, 107-108 
tests for, 171-173 
Sworts, B. K., 174 

T 

/-ratio, 208 
Technicians, 73 
Tclcbinocular, 102-103 
Teletype trainees, 70-71 
Temperament, 76-80 
as factor in rating, 24-26 
Temperament tests, 76-86, 210 
for salesman selection, 167 
for supervisory scleoUoii, 172-173 
Ton Broook, Dolphino L., 138 
Tenure, as criterion, 60 
Test administrator, 214-215 
Tost budget, 180 



INDEX 


227 


Test questions, difllculty of, 183 
cliBcriminaiion value of, 185 
propamtioii of, 182-188 
types of, 180-182 
validily of, 183-188 
Teat revision, 188-100 
Test auppIioTs, 210 
Testing os sampling, 177-178 
Testing equipment, 214 
Testing room, 214 
Tests, administration of, 214-215 
Qommorcially available, list of, 216- 
220 

establishing time limits for, 100-101 
limitations of, 10-11 
Thompson, Loiiu A., Jr», 125 
Thornton, G. H., lOn., 40n. 

Tluirstonc, L L., 54 
Tiilin, Joseph, 2n., 12, 27n., 28, 34-80, 
OOn.,- 70, 73, 07, 00, 100-107, 109, 
115, 120-130, 133, 137, 142, 146-147, 
140-150 

Time, as criterion, 22 
Time limits for tests, 100-101 
Todd, George L., 10^107 
Tool setters, 143 
Toolroom employees, 2 
Trade tests, 125-120, 219-220 
os aptitude tests, 128-120 
oral, 125-126 
outline for, 178-180 
and upgrading, 127-128 
written, 120-127 
Tradesmen, 73 

Trainees, machine tool, 140-142 
sheet metal, 134 

Training time, as criterion, 23, 152 
Transcription tests, 165-150 
Trial battery, administration of, 15-16 
True-false questions, 181-182 
Typing tests, 165 
Twists, 164-166 


U 

Union support, need for, 15-16 
UmoQs, attitude toward testing, 108- 
100 

U. S« Employment Service, 125, 167 
Upgrading and testing, 100 

V 

Validating techniques, 13-18 
Validation of items, 88-80 
Validity, need for establishing, 11-12 
of test items, 183-188 
Variability, 200 
Verbal ability, 66 
Vision, compleiuty of, 96 
Vision tests, 00-122, 220 
dmical, 101-102 
nonclmical, 101-103 
validation of, 103, 116-117 
Visual job families, 117-118 
Visual postures, 06-07 
Visual skills, 00-07 
relationship between, 97-08 
Visualizing ability, 65 
Vitelcs, Morris S., 168-100 
Vocabulaiy tests, 220 

W 

Wadsworth, Guy W., Jr., 06-66, 81-83 
Watch assemblers, 120-1^ 

Whituey, Ilolland L , 121 

Wirt, S. E„ 12«., 97, 106, 109, llln., 118 

Wondcrlic, Eldon F., 67-08 

Wondcrlio Personnel Test, 68 

Wool pullers, 4-5 

World Book Company, 217 

Wright, James H., 68 

Y 

Yates, F-, 20771 . 



